ABSTRACT

Recently developed high speed networks are capable of transmitting data at rates of
100 Mbps or more. One such network protocol is Fiber Distributed Data Interface (FDDI).
This network has a physical transmission rate of 100 Mbps. Analytical and simulation
studies have shown that the FDDI protocol should provide actual throughput of 80% to
95% of this physical rate. Can the end user expect to see this kind of performance? If not,
then what kind of throughput can actually be expected and where are the bottlenecks?

In order to answer these and other related questions, two areas were studied. First, a
performance comparison between a 40MHz SPARCstation 10 workstation and a 50MHz
SPARCstation 10 workstation was conducted using the Neal Nelson commercial
benchmark tool. Next, a well-known network measurement tool, ttcp, was used to obtain
data transfer rates while varying several tunable operating system and network parameters.
The parameters varied were: Target Token Rotation Time, TCP/IP window size, NFS
asynchronous threads, Logical Link buffer size and Maximum Transfer Unit size. The
results from the commercial benchmark analysis were used to determine if there are any
differences which can affect transfer rates between the two workstations.

The results from the commercial benchmark tool clearly showed that the newer, higher
speed processor is faster. The network tool ttcp showed that the TCP/IP window size had
the largest impact on throughput performance: throughput more than doubles from a
window size of 4k to a window size of 20k. The next largest factor is the number of
workstations transmitting data simultaneously: having two workstations transmitting nearly
halves throughput. The third largest factor is processor speed. A measurement of file
transfers using rcp system calls showed that the largest impact on file transfer speed is the
overhead of receiving the transferred file.










TABLE OF CONTENTS

I. INTRODUCTION
    A. BACKGROUND
    B. OBJECTIVE
    C. SCOPE, LIMITATIONS AND ASSUMPTIONS
    D. ORGANIZATION OF THESIS
II. NETWORK PROTOCOLS
    A. NETWORKING THEORY
    B. OPEN SYSTEM INTERCONNECTION
    C. TRANSMISSION CONTROL PROTOCOL/INTERNET PROTOCOL
        1. Link Layer
        2. Network Layer
        3. Transport Layer
        4. Application Layer
    D. FIBER DISTRIBUTED DATA INTERFACE
        1. Fiber Distributed Data Interface Basics
        2. Fiber Distributed Data Interface Layers
            a. The Physical Medium Dependent Layer
            b. The Physical Layer
            c. The Media Access Control Layer
            d. The Station Management Layer
    E. FIBER DATA DISTRIBUTED INTERFACE PARAMETER
III. NETWORK EQUIPMENT
    A. NETWORK OVERVIEW
        1. Fiber Optics Equipment
        2. Network Peripherals' Interface
        3. Silicon Graphic's Interface
    B. WORKSTATION OVERVIEW
        1. SUN SPARCstation 10 system
            a. Software Architecture
            b. Hardware Architecture
        2. Silicon Graphics IRIS Indigo
            a. Software Architecture
            b. Hardware Architecture
IV. TEST DESIGN PLAN
    A. TEST STRATEGY
    B. NEAL NELSON BENCHMARK
    C. NEW TEST TRANSMISSION CONTROL PROTOCOL
    D. REMOTE COPY PROTOCOL TRANSFER
    E. PARAMETERS WHICH AFFECT BOTH TEST
    F. FILE SIZES FOR BOTH TRANSFERS
    G. SYSTEM CONFIGURATIONS FOR ALL TESTS
    H. PARAMETER BASELINE
V. TEST RESULTS AND ANALYSIS
    A. NEAL NELSON BENCHMARK
        1. Gold Versus White, Two Processors and Solaris 2.3
        2. Gold One Processor Versus Gold Two Processors and Solaris 2.3
        3. Gold With One Processor, Solaris 2.3 Versus SunOS 4.1.3
    B. NEW TEST TRANSMISSION CONTROL PROTOCOL
        1. Single Processor Results
        2. Two Processor Results
        3. One And Two Processor Results
    C. REMOTE COPY PROTOCOL TRANSFERS
    D. ANALYSIS SUMMARY
VI. CONCLUSIONS AND TOPICS FOR FUTURE RESEARCH
    A. CONCLUSION
        1. Workstation Conclusions
        2. Throughput Conclusions
    B. TOPICS FOR FUTURE RESEARCH
APPENDIX A: NTTCP PROGRAM and TEST SCRIPTS
APPENDIX B: RCP PROGRAM
APPENDIX C: NEAL NELSON BENCHMARK RESULTS
APPENDIX D: NTTCP SINGLE PROCESSOR RESULTS
APPENDIX E: NTTCP TWO PROCESSORS RESULTS
APPENDIX F: GLOSSARY OF TERMS
LIST OF REFERENCES
INITIAL DISTRIBUTION LIST











LIST OF TABLES

TABLE 1: RCP FILE SIZES AND ASSOCIATED OVERHEAD
TABLE 2: FILES (DATA SIZES) FOR NTTCP TEST
TABLE 3: DEFAULT PARAMETERS USED FOR ALL THREE TEST
TABLE 4: TEST RESULTS IN SINGLE PROCESSOR MODE
TABLE 5: FILES (DATA SIZES) FOR NTTCP TEST
TABLE 6: RESULTS OF SAS PREDICTIONS
TABLE 7: RCP ONE PROCESSOR TRANSFER RESULTS
TABLE 8: RCP TWO PROCESSOR TRANSFER RESULTS
TABLE 9: CPU SUBSYSTEM
TABLE 10: DISK SUBSYSTEM
TABLE 11: CACHE INFORMATION
TABLE 12: GOLD2.SOL VRS WHITE2.SOL, TEST 1 & 2 & 3 & 4
TABLE 13: GOLD2.SOL VRS WHITE2.SOL, TEST 5 & 6 & 7 & 8
TABLE 14: GOLD2.SOL VRS WHITE2.SOL, TEST 9 & 10 & 11 & 12
TABLE 15: GOLD2.SOL VRS WHITE2.SOL, TEST 13 & 14 & 15 & 16
TABLE 16: GOLD2.SOL VRS WHITE2.SOL, TEST 17 & 18 & 19 & 20
TABLE 17: GOLD2.SOL VRS WHITE2.SOL, TEST 21 & 22 & 23 & 24
TABLE 18: GOLD2.SOL VRS WHITE2.SOL, TEST 25 & 26 & 27 & 28
TABLE 19: GOLD2.SOL VRS WHITE2.SOL, TEST 29 & 30
TABLE 20: GOLD1.SOL VRS GOLD2.SOL, TEST 1 & 2 & 3 & 4
TABLE 21: GOLD1.SOL VRS GOLD2.SOL, TEST 5 & 6 & 7 & 8
TABLE 22: GOLD1.SOL VRS GOLD2.SOL, TEST 9 & 10 & 11 & 12
TABLE 23: GOLD1.SOL VRS GOLD2.SOL, TEST 13 & 14 & 15 & 16
TABLE 24: GOLD1.SOL VRS GOLD2.SOL, TEST 17 & 18 & 19 & 20
TABLE 25: GOLD1.SOL VRS GOLD2.SOL, TEST 21 & 22 & 23 & 24
TABLE 26: GOLD1.SOL VRS GOLD2.SOL, TEST 25 & 26 & 27 & 28
TABLE 27: GOLD1.SOL VRS GOLD2.SOL, TEST 29 & 30
TABLE 28: GOLD1.SOL VRS GOLD1.SUN, TEST 1 & 2 & 3 & 4
TABLE 29: GOLD1.SOL VRS GOLD1.SUN, TEST 5 & 6 & 7 & 8
TABLE 30: GOLD1.SOL VRS GOLD1.SUN, TEST 9 & 10 & 11 & 12
TABLE 31: GOLD1.SOL VRS GOLD1.SUN, TEST 13 & 14 & 15 & 16
TABLE 32: GOLD1.SOL VRS GOLD1.SUN, TEST 17 & 18 & 19 & 20
TABLE 33: GOLD1.SOL VRS GOLD1.SUN, TEST 21 & 22 & 23 & 24
TABLE 34: GOLD1.SOL VRS GOLD1.SUN, TEST 25 & 26 & 27 & 28
TABLE 35: GOLD1.SOL VRS GOLD1.SUN, TEST 29 & 30
TABLE 36: SINGLE PARAMETER TEST RESULTS
TABLE 37: SINGLE PROCESSOR, 1ST TEST RESULTS
TABLE 38: SINGLE PROCESSOR, 2ND TEST RESULTS
TABLE 39: SINGLE PROCESSOR, 3RD TEST RESULTS
TABLE 40: SINGLE PROCESSOR, 4TH TEST RESULTS
TABLE 41: SINGLE PROCESSOR, 5TH TEST RESULTS
TABLE 42: SINGLE PROCESSOR, 6TH TEST RESULTS
TABLE 43: SINGLE PROCESSOR, 7TH TEST RESULTS
TABLE 44: SINGLE PROCESSOR, 8TH TEST RESULTS
TABLE 45: SINGLE PROCESSOR, 9TH TEST RESULTS
TABLE 46: SINGLE PROCESSOR, 10TH TEST RESULTS
TABLE 47: SINGLE PROCESSOR, 11TH TEST RESULTS
TABLE 48: SINGLE PROCESSOR, 12TH TEST RESULTS
TABLE 49: SINGLE PROCESSOR, 13TH TEST RESULTS
TABLE 50: SINGLE PROCESSOR, 14TH TEST RESULTS
TABLE 51: SINGLE PROCESSOR, 15TH TEST RESULTS
TABLE 52: SINGLE PROCESSOR, 16TH TEST RESULTS
TABLE 53: SINGLE PROCESSOR, 17TH TEST RESULTS
TABLE 54: SINGLE PROCESSOR, 18TH TEST RESULTS
TABLE 55: SINGLE PROCESSOR, 19TH TEST RESULTS
TABLE 56: SINGLE PROCESSOR, 20TH TEST RESULTS
TABLE 57: SINGLE PROCESSOR, 21ST TEST RESULTS
TABLE 58: SINGLE PROCESSOR, 22ND TEST RESULTS
TABLE 59: SINGLE PROCESSOR, 23RD TEST RESULTS
TABLE 60: SINGLE PROCESSOR, 24TH TEST RESULTS
TABLE 61: SINGLE PROCESSOR, 25TH TEST RESULTS
TABLE 62: SINGLE PROCESSOR, 26TH TEST RESULTS
TABLE 63: SINGLE PROCESSOR, 27TH TEST RESULTS
TABLE 64: SINGLE PROCESSOR, 28TH TEST RESULTS
TABLE 65: SINGLE PROCESSOR, 29TH TEST RESULTS
TABLE 66: SINGLE PROCESSOR, 30TH TEST RESULTS
TABLE 67: SINGLE PROCESSOR, 31ST TEST RESULTS
TABLE 68: SINGLE PROCESSOR, 32ND TEST RESULTS
TABLE 69: SINGLE PROCESSOR, 33RD TEST RESULTS
TABLE 70: SINGLE PROCESSOR, 34TH TEST RESULTS
TABLE 71: PARAMETERS USED FOR TWO PROCESSOR TEST
TABLE 72: TWO PROCESSORS, 1ST TEST RESULTS
TABLE 73: TWO PROCESSORS, 2ND TEST RESULTS
TABLE 74: TWO PROCESSORS, 3RD TEST RESULTS
TABLE 75: TWO PROCESSORS, 4TH TEST RESULTS
TABLE 76: TWO PROCESSORS, 5TH TEST RESULTS
TABLE 77: TWO PROCESSORS, 6TH TEST RESULTS
TABLE 78: TWO PROCESSORS, 7TH TEST RESULTS
TABLE 79: TWO PROCESSORS, 8TH TEST RESULTS
TABLE 80: TWO PROCESSORS, 9TH TEST RESULTS
TABLE 81: TWO PROCESSORS, 10TH TEST RESULTS
TABLE 82: TWO PROCESSORS, 11TH TEST RESULTS
TABLE 83: TWO PROCESSORS, 12TH TEST RESULTS
TABLE 84: TWO PROCESSORS, 13TH TEST RESULTS
TABLE 85: TWO PROCESSORS, 14TH TEST RESULTS
TABLE 86: TWO PROCESSORS, 15TH TEST RESULTS
TABLE 87: TWO PROCESSORS, 16TH TEST RESULTS
TABLE 88: TWO PROCESSORS, 17TH TEST RESULTS
TABLE 89: TWO PROCESSORS, 18TH TEST RESULTS
TABLE 90: TWO PROCESSORS, 19TH TEST RESULTS
TABLE 91: TWO PROCESSORS, 20TH TEST RESULTS
TABLE 92: TWO PROCESSORS, 21ST TEST RESULTS
TABLE 93: TWO PROCESSORS, 22ND TEST RESULTS
TABLE 94: TWO PROCESSORS, 23RD TEST RESULTS
TABLE 95: TWO PROCESSORS, 24TH TEST RESULTS
TABLE 96: TWO PROCESSORS, 25TH TEST RESULTS
TABLE 97: TWO PROCESSORS, 26TH TEST RESULTS
TABLE 98: TWO PROCESSORS, 27TH TEST RESULTS
TABLE 99: TWO PROCESSORS, 28TH TEST RESULTS
TABLE 100: TWO PROCESSORS, 29TH TEST RESULTS
TABLE 101: TWO PROCESSORS, 30TH TEST RESULTS
TABLE 102: TWO PROCESSORS, 31ST TEST RESULTS
TABLE 103: TWO PROCESSORS, 32ND TEST RESULTS
TABLE 104: TWO PROCESSORS, 33RD TEST RESULTS
TABLE 105: TWO PROCESSORS, 34TH TEST RESULTS
TABLE 106: TWO PROCESSORS, 35TH TEST RESULTS
TABLE 107: TWO PROCESSORS, 36TH TEST RESULTS
TABLE 108: TWO PROCESSORS, 37TH TEST RESULTS
TABLE 109: TWO PROCESSORS, 38TH TEST RESULTS
TABLE 110: TWO PROCESSORS, 39TH TEST RESULTS
TABLE 111: TWO PROCESSORS, 40TH TEST RESULTS
TABLE 112: TWO PROCESSORS, 41ST TEST RESULTS
TABLE 113: TWO PROCESSORS, 42ND TEST RESULTS
TABLE 114: TWO PROCESSORS, 43RD TEST RESULTS
TABLE 115: TWO PROCESSORS, 44TH TEST RESULTS
TABLE 116: TWO PROCESSORS, 45TH TEST RESULTS
TABLE 117: TWO PROCESSORS, 46TH TEST RESULTS
TABLE 118: TWO PROCESSORS, 47TH TEST RESULTS
TABLE 119: TWO PROCESSORS, 48TH TEST RESULTS
TABLE 120: TWO PROCESSORS, 49TH TEST RESULTS
TABLE 121: TWO PROCESSORS, 50TH TEST RESULTS
TABLE 122: TWO PROCESSORS, 51ST TEST RESULTS











LIST OF FIGURES

Figure 1: ISO-OSI Reference Model
Figure 2: The Four Layers of the TCP/IP Protocol Suite
Figure 3: IP Header
Figure 4: TCP Header
Figure 5: Relationship Between FDDI and ISO-OSI Layers
Figure 6: Block Diagram of the FDDI Layers
Figure 7: FDDI Frame Format
Figure 8: Composition of FDDI Frames and Percentage of Overhead
Figure 9: Timers and Counters Used in Data Transmission
Figure 10: NPS’s FDDI Research Network
Figure 11: Sun-4m Architecture Used in the SPARCstation 10 System
Figure 12: The IRIS Indigo CPU Board
Figure 13: Flow of Data Across the FDDI Network Using the RCP Command
Figure 14: Example of setsockopt and getsockopt System Calls
Figure 15: Implementation of RCP System Call
Figure 16: Gold Versus White, Two Processors
Figure 17: Gold One Processor Versus Gold Two Processors
Figure 18: Gold, One Processor, SunOS 4.1.3 Versus Solaris 2.3
Figure 19: NTTCP Output for File Size of 4194304 Bytes
Figure 20: SAS Analysis of Single Processor Transfers
Figure 21: Single Processor, File D Transfer From White to Gold
Figure 22: SAS Analysis of Two Processor Transfers
Figure 23: SAS Analysis of Single and Two Processor Transfers
Figure 24: White Single Processor vrs White Two Processors
Figure 25: SAS Throughput Prediction
Figure 26: Relative Importance of Each nttcp Parameter
Figure 27: Throughput Comparison Between White and Gold
Figure 28: RCP File Transfers From Gold To White


I. INTRODUCTION 


A. BACKGROUND 


Data communication networks are now an essential part of our society. Our
technology base has given us workstations which can process data at speeds which make
mainframes from just a few years ago look slow in comparison. Now, not only must we
process the data faster, but we must also distribute the information to other locations at
speeds which just a few years ago were impossible. We truly are in the information era.

In the 1960s and 1970s, the computer industry worked hard to develop new
technologies which would give us faster, more powerful computers. The dramatic advances
in integrated circuit technology made possible the wide availability of larger, more
powerful supercomputers, low-cost workstations, and personal computers [ALBE94].
Some companies believed that large, centralized processors were the
solution to everyone’s problems. At the same time, other companies developed smaller
computers called minicomputers. These minicomputers, and their successors, desktop
workstations, started filling the needs of small companies and universities which could not
afford the cost of large mainframes and did not need the processing power of the
large, all-in-one solution provided by the mainframe.

In the world of mainframes, the need to distribute data to other computers was not 
critical. The single mainframe would handle all of a company’s processing needs. If there 
was a need to handle additional processing, the manufacturer of that mainframe provided a 
solution which would allow their mainframe to communicate with another of their 
mainframes. This, of course, ensured that the company or university continued to buy all or
most of its computer equipment from the same computer manufacturer.

With the growth of minicomputers and workstations came the need to connect
these less expensive and less powerful machines. This provided the motivation and the
driving force behind the development of Local Area Networks (LANs). There were
proprietary options provided by the computer manufacturers. However, with the need to
provide connectivity between systems came the desire to have connectivity between
systems from different manufacturers. This was very difficult without some sort of
agreed-upon standards. In the late 1970s, the International Standards Organization (ISO)
developed the Open Systems Interconnection (OSI) reference model to serve as the basis
for future open networks. This model would provide the basis for computers from different
vendors to be able to communicate with each other [ALBE94].


Now we have the beginnings of connectivity between computers and the beginnings
of smaller, more powerful computers. In the 1980s, Sun Microsystems started producing
their line of desktop workstations. Within a few years, these workstations were being based
on new Reduced Instruction Set Computer (RISC) technology which allowed Sun
Microsystems and other companies to produce faster, more powerful workstations. Now if
we combine the advancements of the desktop workstations with the advancements made in
networks, we have the true beginnings of the information era.


The question now becomes one of which technology is advancing faster. Are we
producing workstations which can exceed the capability of the networks, or are the
networks staying ahead of the abilities of the workstations? Also, advancements in
workstation technology are not limited to faster hardware. Are the operating system and its
networking tools keeping pace with current demands?


It is clear that the workstations are faster and more powerful than in the past. It is also 
clear that the networks can handle more data at faster rates than in the past. But where do 
we stand if we compare a recently released product produced by Sun Microsystems with 
one of the current high speed networks such as Fiber Distributed Data Interface (FDDI)?


B. OBJECTIVE 


The objective of this thesis is to measure actual throughput between high
performance workstations over an FDDI network to determine what bottlenecks, if any,
exist between Sun Microsystems SPARCstation™ 10 multiprocessors running Solaris™
2.3 and the Network Peripherals™ SBus FDDI Network Interface cards, and to evaluate
Transmission Control Protocol/Internet Protocol (TCP/IP) as a high speed transport
protocol. This process requires an analysis of the workstations being used in this study,
an understanding of current network operating system tools, and measurements of data
transfers across the network being tested.

This is not simply a matter of reading the vendor’s promotional literature and seeing
which aspect of the distributed processing environment is more capable. Vendors normally
promote those aspects of their products which they can demonstrate as performing at or
above some threshold. This threshold may or may not be of value to the consumer.


C. SCOPE, LIMITATIONS AND ASSUMPTIONS 

The scope of this investigation is limited to performing testing and tuning at the level
available to any system administrator. No modifications are made to any hardware, and no
changes are made to the workstation kernel other than to its tunable parameters. From
this investigation, a determination will be made as to whether or not there are any
bottlenecks.

It is assumed that the changes made and the results observed on the SPARCstation 10
multiprocessors running Solaris 2.3 can be extrapolated to other vendors’ hardware and
software. If we note that changing the TCP/IP window size on our workstations results in
a tenfold increase in throughput, then we assume comparable results would be observed on
other vendors’ workstations.
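As an illustration of the kind of administrator-level tuning described here, the following Python sketch requests socket buffer sizes with setsockopt and reads back what the operating system actually granted with getsockopt. It is a minimal sketch and not the code used in this investigation: the function names, the 4 KB and 20 KB sizes taken from the abstract, and the 1 ms round-trip time are illustrative assumptions. On BSD-style socket implementations the receive buffer bounds the window a TCP receiver can advertise, and a sender cannot have more than one window of data outstanding per round trip, which gives the simple upper bound computed by window_limited_mbps.

```python
import socket


def open_socket_with_window(window_bytes):
    """Create a TCP socket and request send/receive buffers of window_bytes.

    The kernel may round, double, or clamp the request, so the value
    granted should always be read back with getsockopt.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, window_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, window_bytes)
    return sock


def effective_buffers(sock):
    """Return the (send, receive) buffer sizes the kernel actually granted."""
    snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return snd, rcv


def window_limited_mbps(window_bytes, rtt_seconds):
    """Upper bound on throughput when one window is sent per round trip."""
    return (window_bytes * 8) / rtt_seconds / 1e6


if __name__ == "__main__":
    # Hypothetical window sizes (4 KB and 20 KB) and round-trip time (1 ms).
    for window in (4 * 1024, 20 * 1024):
        s = open_socket_with_window(window)
        print(window, effective_buffers(s), window_limited_mbps(window, 0.001))
        s.close()
```

The bound is only a ceiling. Measured throughput also depends on processor speed, protocol overhead and competing traffic, which is why the effect of each parameter has to be measured rather than calculated.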


D. ORGANIZATION OF THESIS 

This thesis is organized into six chapters. This chapter provides the introduction and
scope of work to be performed. Chapters II and III provide a background on networks in
general, FDDI specifically and the specifics on the workstations involved in this
investigation. Chapters IV and V cover the methodology, test results and analysis of results.
Chapter VI covers what conclusions can be derived from these results.











II. NETWORK PROTOCOLS


A. NETWORKING THEORY 

The primary focus behind the development of network protocols has been the 
organization of the protocol into a series of layers. This has allowed the design of the 
protocols to be simplified by focusing attention at each layer upon that layer’s function and 
its interaction with the layers above and below. The purpose of each layer is to offer certain 
services to the layer above without the higher layer needing to know how those services 
were provided. 

When designing a network protocol the network designer must determine how many 
layers the protocol will have, what those layers will do and how the layers will 
communicate with each other. This last decision, deciding how the layers will 
communicate, is one of the more important considerations. A clean-cut interface must be 
defined which will minimize the amount of information that must be passed between 
layers. 

The set of layers and protocols is known as the network architecture. Enough
specification must be given for each layer of the protocols so that vendors can write their 
versions of the protocol for their computer architecture. This is what makes the network 
architectures beneficial to everyone accessing a network. By having an agreed upon 
network architecture that everyone is willing to use, we can have distributed processing 
over heterogeneous processors [MINO91]. 


B. OPEN SYSTEM INTERCONNECTION

The Open System Interconnection (OSI) reference model, Figure 1, was proposed in
1978 to promote compatibility between network designs. This model was approved as a
standard [ALBE94] in 1983 by the International Standards Organization (ISO). The
reference model is not a protocol or set of rules but a layering of required functions, or
services, that provides a framework with which to define protocols. In practical terms, OSI
is seen as a means of developing communications networks which are not restricted by the
need to conform to a rigid set of manufacturers’ proprietary standards and protocols.






[Diagram labels: Open Relay Systems; Physical Media for Interconnection]

Figure 1: ISO-OSI Reference Model





The purpose of these seven layers is to define the various functions that must be carried
out when two machines communicate. Each of the seven layers is architecturally
independent, so that the relevant protocols and service functions of each layer can be
developed independently. The seven layers of the model can be roughly divided into two
parts. The first four layers, physical to transport, provide the telecommunications functions
and operate on a node-to-node basis. The top three layers, session to application, are
concerned mainly with carrying out processing functions and creating a meaningful dialog
between the user and the application.

Below are the seven layers of the OSI model [STAL91]:

• Layer 1: Physical Layer
• Layer 2: Data Link Layer
• Layer 3: Network Layer
• Layer 4: Transport Layer
• Layer 5: Session Layer
• Layer 6: Presentation Layer
• Layer 7: Application Layer


C. TRANSMISSION CONTROL PROTOCOL/INTERNET PROTOCOL 


The Transmission Control Protocol/Internet Protocol (TCP/IP) protocol suite is also
structured as a series of layers. Each layer is designed for a specific purpose. They are
designed so that a specific layer on one machine sends or receives exactly the same object
sent or received by its twin on another machine. This is done without regard to what is
going on in layers above or below the layer under consideration.

The advantage of layering is that it simplifies protocol design. The designer can
concentrate on a specific layer without regard to the design of other layers. For example,
when designing the transport layer of the protocol, the engineer need be concerned only
with assuring that a packet received by one machine is identical to the packet sent by
another. The message contained in the packet is of no concern. The integrity of the message
is of concern only to the designer of the application layer.


Members of the TCP/IP family include the Internet Protocol (IP), Transmission 
Control Protocol (TCP), User Datagram Protocol (UDP), Address Resolution Protocol 
(ARP), Reverse Address Resolution Protocol (RARP), and the Internet Control Message 
Protocol (ICMP). The entire family may be referred to as TCP/IP, reflecting the names of 
the two main protocols. 


The OSI model describes an idealized network communications model. TCP/IP does 
not correspond to this model at every level, but instead either combines the functions of 
several OSI layers into a single layer, or finds no need to make use of certain layers. In 
consequence, TCP/IP can be described by a simpler model as shown in Figure 2 [STEV94]. 





1. Link Layer


The Link layer is the hardware level of the protocol model. It specifies the
physical connections between hosts and networks, and the procedures used to transfer
packets between machines.


Layer          Examples
Application    Telnet, FTP, e-mail, etc.
Transport      TCP, UDP
Network        IP
Link           device driver and interface card

Figure 2: The Four Layers of the TCP/IP Protocol Suite


2. Network Layer 

This layer is responsible for machine-to-machine communications. It determines 
the path a transmission must take, based on the receiving machine’s IP address. The 
network layer also provides transmission formatting services; it assembles data for 
transmission into an internet datagram. If the datagram is outgoing (received from the 
higher layer protocols), the network layer attaches an IP header (Figure 3) to it. This header 
contains a number of parameters, most significantly the IP addresses of the sending and 
receiving host. Other parameters include datagram length and identifying information, in 
case the datagram exceeds the allowable byte size for network packets and must be 
fragmented. 
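
The header fields just described can be made concrete with a short sketch. The Python fragment below packs a minimal 20-byte IP header using the standard struct module. The helper name build_ip_header, the example addresses and the default field values are illustrative assumptions, not values from the test network, and the checksum is simply left at zero.

```python
import socket
import struct

def build_ip_header(src, dst, payload_len, ident=1, ttl=64, proto=6):
    """Pack a minimal 20-byte IPv4 header (no options, checksum left at 0)."""
    version_ihl = (4 << 4) | 5          # IPv4, header length of five 32-bit words
    total_len = 20 + payload_len        # datagram length: header plus payload
    return struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl, 0, total_len,      # version/IHL, type of service, total length
        ident, 0,                       # identification, flags/fragment offset
        ttl, proto, 0,                  # time to live, protocol (6 = TCP), checksum
        socket.inet_aton(src),          # IP address of the sending host
        socket.inet_aton(dst),          # IP address of the receiving host
    )

# Hypothetical addresses taken from a documentation range.
header = build_ip_header("192.0.2.1", "192.0.2.2", payload_len=4440)
print(len(header))
```

The identification and fragment-offset fields are the "identifying information" that lets the receiver reassemble a datagram that had to be fragmented.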


3. Transport Layer 
The transport layer protocols enable communications between application
programs running on separate machines. The transport layer assures that data arrives in
sequence and without error. It does so by exchanging acknowledgments of data reception
and retransmitting lost packets. This type of communication is known as “end-to-end”.
Protocols at this level are TCP and UDP; of the two, only TCP provides the acknowledged,
in-sequence delivery just described.

Figure 3: IP Header


TCP attaches a header onto the transmitted data. This header contains a large
number of parameters (see Figure 4) which help processes on the sending machine connect
to peer processes on the receiving machine. TCP uses 16-bit port numbers as its addressing
method. Servers are normally known by their well-known port number. For example, every
TCP/IP implementation that provides an FTP server provides that service on TCP port 21.
Every Telnet server is on TCP port 23 [STEV94].
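
As an illustration of port addressing, the sketch below packs a minimal TCP header and reads the two 16-bit port numbers back out. The function names, the client port 1025 and the window value are assumptions chosen for the example; only the well-known ports 21 and 23 come from the text.

```python
import struct

FTP_PORT = 21      # well-known FTP port
TELNET_PORT = 23   # well-known Telnet port

def build_tcp_header(src_port, dst_port, seq=0, ack=0, window=4096):
    """Pack a minimal 20-byte TCP header (no options, checksum left at 0)."""
    offset_flags = (5 << 12) | 0x002    # data offset of 5 words, SYN flag set
    return struct.pack("!HHIIHHHH",
                       src_port, dst_port,   # 16-bit source and destination ports
                       seq, ack,             # sequence and acknowledgment numbers
                       offset_flags, window, # header length/flags, window size
                       0, 0)                 # checksum, urgent pointer

def ports(header):
    """Return the (source, destination) port pair from a packed TCP header."""
    return struct.unpack("!HH", header[:4])

hdr = build_tcp_header(src_port=1025, dst_port=FTP_PORT)
print(len(hdr), ports(hdr))
```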


4. Application Layer 
The application layer lets you use various TCP/IP standard internet services.
These services work with the next lowest level of protocols (transport) to send and receive
data. These services include telnet, ftp, rcp, and the Domain Name Service (DNS).

telnet. The Telnet protocol enables terminals and terminal-oriented processes to
communicate on a network running TCP/IP.

Figure 4: TCP Header


ftp. ftp transfers files to and from a remote network. Unlike rcp, ftp works even
when the remote computer is running a non-UNIX operating system. A user must “log in”
to the remote computer to make an ftp connection unless a system administrator has set up
the computer to allow “anonymous ftp”.

rcp. rcp copies one or more files or hierarchies to and from a remote computer.
The remote computer must be running UNIX. One must be an accepted user of the remote
computer (i.e., the user’s name must be in the remote computer’s password database, and
the user’s machine name must be listed in the remote .rhosts file). If this is not the case, a
user cannot copy anything to or from the remote machine. The user must know the
complete pathname of the file or directory to be copied.

DNS. DNS provides the host-name-to-IP-address service. It is a distributed
database that is used by TCP/IP applications to map between hostnames and IP addresses.
The DNS provides the protocol that allows clients and servers to communicate with each
other and to provide electronic mail routing information.








D. FIBER DISTRIBUTED DATA INTERFACE 


1. Fiber Distributed Data Interface Basics 


Fiber Distributed Data Interface (FDDI) is a 100 Mbps high speed LAN standard
developed under the auspices of the American National Standards Institute (ANSI) X3T9.5
committee. FDDI was developed to create a reliable, fault-tolerant, high-speed network
connecting numerous stations over greater distances than existing standards. Although
FDDI is somewhat similar to the IEEE 802 standards, it is not part of that family of
standards [MINO91].


The ANSI X3T9.5 committee developed specifications for a network based on a
dual counter-rotating fiber optic ring using a timed-token protocol, which is capable of
transmitting data at 100 Mbps in each ring and which can extend to 500 stations over a total
fiber length of 200 km with full system performance. The dual counter-rotating ring can
support connections up to 2 km with multimode fiber and connections up to 60 km using
single-mode fiber.


The FDDI standard allows for two types of traffic: synchronous and
asynchronous. Synchronous traffic should consist of data which is time sensitive such as
voice or interactive video. Any delay in the throughput of this traffic has an adverse effect
on the quality of the data being transferred. Asynchronous traffic should consist of more
routine data transfers such as email, file transfers and Network File System (NFS) or
Network Information Service (NIS) traffic. These packets of data can sustain some
reasonable delays in transmission without any adverse effects on the applications.


2. Fiber Distributed Data Interface Layers 


The standard for FDDI developed by the X3T9.5 committee included four layers
shown in Figure 5. They are the Media Access Control (MAC) layer, the Physical (PHY)
layer, the Physical Medium Dependent (PMD) layer, and the Station Management (SMT)
document [ALBE94].

Figure 5: Relationship Between FDDI and ISO-OSI Layers


The four layers of FDDI fall under the first two layers of the OSI Model. The
physical layer of FDDI is specified in two documents: the FDDI PMD, which defines the
optical interconnecting components used to form links, and the FDDI PHY, which defines
the encoding scheme used to represent data and control symbols. The data link layer (DLL)
is also divided into two sublayers: a MAC layer and a Logical Link Control (LLC) layer.
The MAC portion provides access to the medium, address recognition, and generation and
verification of frame check sequences. The LLC specification is not part of the FDDI
standard [MINO91].

Below in Figure 6 is an additional graphical representation of the interaction 
between the FDDI standards as described in [POWE93]. 


a. The Physical Medium Dependent Layer 
This layer defines all transmitters, receivers, cables, connectors and other
physical media and hardware. There are currently six media options provided for the PMD
layer:





• Multimode fiber (PMD)
• Single-mode fiber (SMF-PMD)
• Low-cost fiber (LCF-PMD)
• Shielded twisted pair (STP-PMD)
• Unshielded twisted pair (UTP-PMD)
• FDDI on Synchronous Optical Network (SONET)


IEEE 802.2 LLC
MAC: packet interpretation, token passing, packet passing
SMT: monitor ring, manage ring, configure ring, manage connections
PHY: encode/decode, clocking
PMD: electronic/optic conversion
Fiber out / Fiber in

Figure 6: Block Diagram of the FDDI Layers





The first three options are published or soon to be published standards. The 
last three options are under development [ALBE94]. 

The PMD layer provides the PHY layer all the services required to transport 
a coded bit stream from one node to the next node. It converts the encoded data requests 
from the PHY layer into either optical or electrical signals depending on the media being 
used. It also provides SMT with the needed services required for proper ring management. 
The PMD layer informs both the SMT and PHY layers whenever it detects a signal on the 
medium [ALBE94]. 








b. The Physical Layer 
This layer provides media independent functions associated with the OSI
physical layer. The PHY layer decodes the incoming bit stream into a symbol stream for use
by the MAC layer and it encodes the data and control symbols provided by the MAC layer
for transmission via the PMD layer. The PHY layer continuously monitors the ring status
by listening to incoming signals and passes this information on to the SMT layer [ALBE94].


c. The Media Access Control Layer 

This layer provides fair and deterministic access to the network. The access
is fair because a workstation’s physical location does not give it any advantage in accessing
the medium over another workstation’s location. The service is deterministic in that
the time a workstation has to wait for the token can be predicted under error-free
conditions.

In FDDI, medium access is controlled by a token. The workstation which
possesses the token can transmit frames. The other workstations on the network repeat the
frame, and the destination workstation copies the frame in addition to repeating it. The
MAC layer of the workstation which generated the frame is responsible for removing the
frame and passing the token downstream to the next workstation when its Token Holding
Time (THT) has expired [ALBE94].


d. The Station Management Layer 
The SMT layer provides services such as node initialization, bypassing faulty 
nodes, coordination of node insertion and removal, fault isolation and recovery and 
collection of statistics. The SMT layer provides these functions using services provided by 
the PMD, PHY and MAC layers. 
3. Fiber Distributed Data Interface Framing


Most communication within FDDI is carried in frames (except Physical
Connection Management (PCM) signaling). Within the MAC layer there are three frame
types:

• Tokens
• Management frames
• Data frames


Each frame is made up of three parts. The first part is the start of the frame 
sequence. The next part is the data or information part of the frame. The last part is the end 
of the frame sequence. The data frame is shown in Figure 7 along with the size of each field 
in symbols [ALBE94]. 
Sizes are in symbols; 1 symbol = 4 bits.
Total frame (minus information) size: 40 symbols * 4 bits / 8 bits = 20 bytes

Figure 7: FDDI Frame Format


The start part of the frame is 28 symbols in length. Each symbol is a 4-bit unit.
This means the start portion of the FDDI frame is 28 symbols * 4 bits / 8 bits = 14 bytes
long. The end portion of the FDDI frame is 12 symbols or 6 bytes long. Since the maximum
frame length is 9,000 symbols or 4,500 bytes, this leaves 4,480 bytes available for data or
information. This remaining portion of 4,480 bytes is also known as the FDDI Maximum
Transfer Unit (MTU) value [ALBE94].
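
The symbol-to-byte arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation in the text; the variable names are chosen for the example.

```python
BITS_PER_SYMBOL = 4

def symbols_to_bytes(symbols):
    """Convert a length in FDDI symbols to bytes."""
    return symbols * BITS_PER_SYMBOL // 8

start_bytes = symbols_to_bytes(28)         # start portion of the frame
end_bytes = symbols_to_bytes(12)           # end portion of the frame
max_frame_bytes = symbols_to_bytes(9000)   # maximum frame length
payload_bytes = max_frame_bytes - start_bytes - end_bytes

print(start_bytes, end_bytes, max_frame_bytes, payload_bytes)  # 14 6 4500 4480
```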


4. Encoding Method 

Digital data needs to be encoded for proper transmission. The type of encoding
used is determined by the type of media being used, the desired data rate, noise present on
the transmission media and other factors. Since FDDI was originally intended for use over
fiber optics, the encoding method selected needed to provide a digital-to-analog capability.

FDDI uses a two-stage encoding scheme: 4B/5B group encoding along with the
digital signal encoding method known as Non-Return to Zero Inverted (NRZI). NRZI is an
example of differential encoding. The signal is decoded by comparing the polarity of
adjacent signal elements rather than determining the absolute value of a signal element. In
4B/5B, the encoding is done 4 bits at a time, resulting in 5 encoded bits. Then, each element
of the 4B/5B stream is treated as a binary value and encoded using NRZI.

The result is that FDDI is able to achieve a 100 Mbps throughput using a 125-MHz
rate. As mentioned earlier, the PHY layer is responsible for decoding the 4B/5B
NRZI signal from the network into symbols that can be recognized by the station. The
synchronization is derived from the incoming signal and the data are then retimed to an
internal clock through an elasticity buffer.
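
A minimal sketch of the two encoding stages follows. It assumes the standard 4B/5B data code groups and an initial line level of 0; control symbols, clock recovery and the elasticity buffer are not modeled, and the function names are chosen for the example.

```python
# Standard 4B/5B data code groups, indexed by the 4-bit data value.
CODE_4B5B = [
    "11110", "01001", "10100", "10101",
    "01010", "01011", "01110", "01111",
    "10010", "10011", "10110", "10111",
    "11010", "11011", "11100", "11101",
]

def encode_4b5b(data):
    """Map each 4-bit nibble (high nibble first) to its 5-bit code group."""
    bits = []
    for byte in data:
        for nibble in (byte >> 4, byte & 0x0F):
            bits.extend(int(b) for b in CODE_4B5B[nibble])
    return bits

def nrzi(bits, level=0):
    """NRZI: a 1 toggles the line level, a 0 leaves it unchanged."""
    out = []
    for bit in bits:
        if bit:
            level ^= 1
        out.append(level)
    return out

def nrzi_decode(levels, level=0):
    """Recover bits by comparing each signal element with the previous one."""
    bits = []
    for cur in levels:
        bits.append(1 if cur != level else 0)
        level = cur
    return bits

code_bits = encode_4b5b(b"\x5a")   # nibbles 5 and A
line = nrzi(code_bits)
data_rate_mbps = 125 * 4 / 5       # 4 data bits carried per 5 code bits
```

Because every 4 data bits cost 5 code bits on the line, a 125-MHz signaling rate yields 125 * 4 / 5 = 100 Mbps of data.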


E. NETWORK OVERHEAD 

The process of transferring data from one workstation to another involves all the layers
of protocols described previously. Even though the protocols are broken into layers to
distribute functionality, the result is increased overhead. As discussed earlier, for each layer
of protocol, there is an associated overhead at that layer as shown in Figure 8.

Figure 8: Composition of FDDI Frames and Percentage of Overhead











The amount of overhead involved in transferring data is dependent upon the protocols
used and the network media being used as the transfer agent. For FDDI, the overhead is
calculated as follows:

Data          Overhead   Level         Total Overhead
4,440 bytes   0 bytes    Application   0 bytes
4,440 bytes   20 bytes   TCP           20 bytes
4,440 bytes   20 bytes   IP            40 bytes
4,440 bytes   20 bytes   FDDI          60 bytes


In this example, the frame of data being sent is 4,500 bytes; the total amount of data being
transferred is 4,440 bytes and the total amount of overhead is 60 bytes. Therefore, the
percentage of overhead is the amount of overhead (60 bytes) divided by the total frame size
(4,500 bytes). Overhead = 60 bytes / 4,500 bytes = 1.33%. If we were to send only 11 bytes
of data, then the overhead would be 60 bytes / 71 bytes = 84.5%. It is clear that the more
data sent in each FDDI frame, the lower the percentage of overhead associated with that
frame. Note that in this example the overhead from the application layer was not included.
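
The overhead calculation can be expressed as a small function. The sketch below uses the per-layer figures from the table above; the function and variable names are chosen for the example.

```python
# Per-layer overhead in bytes, taken from the table above.
HEADER_BYTES = {"TCP": 20, "IP": 20, "FDDI": 20}

def overhead_percent(data_bytes):
    """Overhead as a percentage of the total frame (data plus headers)."""
    overhead = sum(HEADER_BYTES.values())
    return 100.0 * overhead / (data_bytes + overhead)

print(round(overhead_percent(4440), 2))  # 1.33
print(round(overhead_percent(11), 1))    # 84.5
```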


F. FIBER DISTRIBUTED DATA INTERFACE PARAMETERS


This section will give a brief explanation of FDDI parameters as covered in the ANSI
standards. The MAC layer must implement a number of these parameters as timers and
counters. The three main goals of these timers and counters are to [ALBE94]:

• Allow the initialization of the token rotation timer
• Permit fast recovery from ring errors
• Aid in the collection of ring statistics for SMT

Below in Figure 9 is a list of the important timer values and variables used in the data
transmission process. According to the FDDI standards, every time a node releases a token,
it loads the value of T_Opr into the Token Rotation Timer (TRT). This timer then decrements
until it reaches zero. If it reaches zero before a valid token is received, the token is said to
be late and the late counter (Late_Ct) is incremented. If TRT expires a second time before
a valid token is received, an error condition exists and recovery procedures are initiated.

The token holding timer (THT) is used to control asynchronous transmission in a dynamic
manner. When a valid token is received and Late_Ct is not set, the token is said to be
early and the node may transmit asynchronous data. In this case, THT is set to T_Opr minus
TRT and the node may transmit until THT expires. TVX is a hardware backup timer that
is used to prevent nodes from blabbering on the network due to some error or
miscalculation of THT [ALBE94].
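
The timer rules above can be sketched as two small functions. This is an illustration under stated assumptions rather than the MAC state machine from the standard: the quantity subtracted from T_Opr is read as the time elapsed since the node last restarted TRT, a late token is assumed to clear the late counter, and the function names and time units are hypothetical.

```python
def on_token_arrival(t_opr, elapsed, late_ct):
    """Classify a token arrival and return (token_is_early, tht, new_late_ct).

    t_opr   -- operative target token rotation time
    elapsed -- time since this node last restarted TRT (assumed reading of TRT)
    late_ct -- value of the late counter when the token arrives
    """
    if late_ct == 0 and elapsed < t_opr:
        # Early token: asynchronous data may be sent for the unused time.
        return True, t_opr - elapsed, 0
    # Late token: no asynchronous transmission; assume the counter is cleared.
    return False, 0, 0

def trt_expired(late_ct):
    """TRT ran out before a valid token arrived; return (new_late_ct, ring_error)."""
    if late_ct == 0:
        return 1, False      # first expiry: the token is late
    return late_ct, True     # second expiry: recovery procedures are initiated

print(on_token_arrival(t_opr=8.0, elapsed=3.0, late_ct=0))
```

With a T_Opr of 8 time units and a token that returns after 3, the node may hold the token for the remaining 5 units of asynchronous transmission.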


Name      Description
TTRT      Target token rotation time
TRT       Token rotation timer
T_Opr     Operative TTRT negotiated during claim process
Late_Ct   Late counter
THT       Token holding timer
TVX       Transmission valid timer

Figure 9: Timers and Counters Used in Data Transmission





III. NETWORK EQUIPMENT


A. NETWORK OVERVIEW 

The Naval Postgraduate School (NPS) FDDI research network consists of three
machines operating on a ring. The names of the three machines on the FDDI LAN are
“Black”, “White” and “Gold”. Gold is the server on the network. The network is set up as
shown in Figure 10.


Gold:  NPI SBus FDDI, SMT7.2 V2.2
White: NPI SBus FDDI, SMT7.2 V2.2
Black: SGI xpi0, SGI FDDI SMT V3.0.1
(Token rotation and data flow pass around the ring.)

Figure 10: NPS’s FDDI Research Network
1. Fiber Optics Equipment


The specifications for the fiber optics equipment can be found in the PMD
standards. Originally, only optical fiber was specified as a physical medium for FDDI. Now
it is possible to also use shielded twisted-wire for short-distance transmissions. The
requirements for twisted-wire can be found in the STP-PMD standards.

The recommended fiber size for FDDI is 62.5/125 µm. The operating wavelength
is specified as 1300 nm and the minimum allowable power for the transmitter is -16 dBm.
PIN diodes are to be used in the link. PIN diodes were chosen over avalanche photodiodes
since PIN diodes are a more mature technology and would result in a lower cost receiver.
The bit-error rate (BER) of the network is 4 x 10^-11 and the maximum number of nodes is
500 [POWE93].


2. Network Peripherals’ Interface

The Network Peripherals Inc. (NPI) SBus FDDI Network Interface conforms to
Sun Microsystems’ requirements for an SBus adapter. It mounts in an SBus slot and
implements burst mode Direct Memory Access (DMA) for the highest system performance
[NPI93].

As stated earlier, FDDI is designed to provide the capability for both synchronous
and asynchronous data transfer. This is not the case with NPI’s SBus FDDI Interface card.
Furthermore, it is not the case for all known current implementations of FDDI. This makes
the relationship of the timers and counters described earlier not as well defined. Without
synchronous and asynchronous transfers, there is no need for Late_Ct and THT. Below is
a list of the parameters which NPI lists as tunable. Note that there is not a
parameter listed here which specifies how long a node can maintain the token.


sbf_num_llc_rx   /* For LLC network traffic: */
                 /* number of 4k receive buffers, maximum is 64 4k buffers */
                 /* Default is 48 4k buffers per NP-SB adapter */

sbf_num_smt_rx   /* For SMT network traffic: */
                 /* number of 4k receive buffers, maximum is 64 4k buffers */
                 /* Default is 4 4k buffers per NP-SB adapter */

sbf_mtu          /* Maximum protocol packet size, default is 4352 bytes */
sbf_T_Notify     /* SMT Neighbor Notification Timer, default is 30 seconds */
sbf_num_mcast    /* number of multicast entries, default is 16 */


These parameters can be tuned by entering the appropriate line below in
/etc/system for each parameter.


1. To change number of receive buffers to 64:
set sbf:sbf_num_llc_rx = 64

2. To change MTU size to 4192 bytes:
set sbf:sbf_mtu = 4192

3. To change T_Notify timer to 10 seconds:
set sbf:sbf_T_Notify = 10


After contacting NPI it was learned that there is another parameter which is not
advertised called t_req. This parameter determines how long the node is allowed to hold
the token.


3. Silicon Graphics’ Interface


FDDIXPress™ 3.0.1 is a network interface controller (board and software)
providing FDDI connectivity for Silicon Graphics workstations and servers. For the IRIS
Indigo, FDDIXPress has two configurations of the FDDI board: FDDIXPI and FDDIXPID.
The FDDIXPI board allows one single-attachment FDDI connection to an FDDI
concentrator; the FDDIXPID board provides a dual-attachment FDDI connection directly
to the dual ring, or one or two connections to an FDDI concentrator. An Indigo can
accommodate one of these boards.

When FDDIXPress is installed, an Indigo can also use its built-in Ethernet
network interface, thus having two network interfaces. FDDIXPress for IRIS Indigo has
been designed for customer installation.
B. WORKSTATION OVERVIEW 


1. SUN SPARCstation 10 system 


The SPARCstation 10 systems used in this test were the new multiprocessing
systems running Solaris 2.3. We had two SPARCstation 10 systems, Gold and White,
available for our FDDI research. Both systems have two processors, two internal hard disk
drives and 224 Mbytes of Dynamic Random Access Memory (DRAM). Gold has two 50MHz
processors and two 1 GB internal drives. White has two 40MHz processors, one 1 GB internal
drive and one 425 MB internal drive.


a. Software Architecture 


Solaris 2.3 is a multilayered operating system that includes SunOS 5.3, Open 
Network Computing (ONC), Open Windows, and the DeskSet. At the core of Solaris is 
SunOS, the collection of programs that actually manages the system, which includes the 
kernel, the file system, and the shells. 


SunOS is a collection of UNIX programs that control the Sun workstation and 
provide a link between the user, the workstation, and its resources. It has its roots firmly 
placed in the two most popular UNIX families: Berkeley UNIX (BSD) and AT&T’s UNIX. 
Early versions of SunOS blended some of AT&T’s UNIX with Berkeley UNIX and offered 
additional enhancements. 

AT&T and Sun Microsystems later worked together to create a new industry
standard, AT&T UNIX System V Release 4, commonly known as SVR4. SunOS 5.3
merges SunOS 4.1 and SVR4. Most of the new changes in SunOS come from SVR4. As a
result, Solaris 2.3 is based on SVR4 but contains a few additional BSD/SunOS features
[HESL93].














b. Hardware Architecture 
The SPARCstation 10 architecture is shown in Figure 11 [SUNM90].

SuperSPARC microprocessor. This is a high-performance CPU chip that
has the following features:

• A single chip with integer, floating point, memory management, and caches.
• Superscalar pipeline with up to three instructions launched per clock cycle.
• 20-Kbyte instruction cache and 16-Kbyte data cache.
• 64-entry TLB with hardware page-table walking.
• Integral support for cache-coherent multiprocessing.


The SuperSPARC processor has a companion chip, the SuperCache 
controller, which provides for a 1-Mbyte external cache. Additionally, SPARC modules 
with SuperCache controllers can operate asynchronous to the system clock. 

MBus. The MBus is a high performance memory bus which was first 
introduced in Sun’s SPARCserver 600MP family. It is a synchronous, 40-MHz 64-bit bus 
that is capable of a peak transfer rate of 320 Mbytes/second. Typically, the MBus can 
sustain a rate of 100 Mbytes/second. 
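
The peak figure quoted for the MBus follows directly from its clock rate and width. The short sketch below reproduces that arithmetic and, for comparison, expresses the 100 Mbps FDDI rate in the same units; the function name is chosen for the example.

```python
def peak_mbytes_per_second(clock_mhz, width_bits):
    """Peak rate when one full-width transfer completes every clock cycle."""
    return clock_mhz * width_bits / 8

mbus_peak = peak_mbytes_per_second(40, 64)   # 320.0 Mbytes/second
sustained_fraction = 100 / mbus_peak         # typical sustained rate relative to peak
fddi_mbytes = 100 / 8                        # 100 Mbps FDDI rate in Mbytes/second

print(mbus_peak, sustained_fraction, fddi_mbytes)
```

On these figures the typical sustained MBus rate is still eight times the raw FDDI rate of 12.5 Mbytes/second.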

This bus provides support for symmetric multiprocessing by means of a 
“snooping” protocol. Whenever a processor puts an address onto the MBus, all other 
processors “snoop” the bus, checking to see if data at the snooped address is in their cache. 

Main memory architecture. The Sun-4m architecture uses a 144-bit wide
memory data path (128 bits of data and 16 bits of error detection and correction). The use
of a 128-bit wide memory data path has two advantages. First, the 32-byte cache fill can be
accomplished quickly. Second, error corrections can be performed on each 64-bit word.
Single-bit errors can be corrected and double-bit (4-bit) errors can be detected.
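
The 16 check bits are consistent with a single-error-correcting, double-error-detecting (SEC-DED) code applied to each 64-bit word. The sketch below assumes a Hamming-style construction, which the source does not state explicitly, and computes how many check bits such a code needs.

```python
def sec_ded_check_bits(data_bits):
    """Check bits for a Hamming SEC-DED code protecting data_bits bits of data."""
    r = 1
    while 2 ** r < data_bits + r + 1:   # Hamming bound for single-error correction
        r += 1
    return r + 1                        # one extra parity bit for double-error detection

per_word = sec_ded_check_bits(64)       # check bits per 64-bit word
path_width = 2 * (64 + per_word)        # two protected words side by side

print(per_word, path_width)  # 8 144
```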

Figure 11: Sun-4m Architecture Used in the SPARCstation 10 System


I/O architecture. A single Application-Specific Integrated Circuit (ASIC)
serves as the interface between the MBus and the SBus. The MBus is used as the processor
memory interconnect, while the SBus is used only for I/O. The SPARCstation 10 system
supports four SBus slots. They provide the means to interface a variety of I/O options,
including network interfaces such as FDDI, graphics adapters and laser printer interfaces.


2. Silicon Graphics IRIS Indigo 


The Silicon Graphics IRIS Indigo used in this test was an IRIS-4D™, model
4D/RPC. The IRIS Indigo uses the R3000A CPU RISC processor from MIPS Computer
Systems Inc. It is assisted by a 32 Kbyte data and instruction cache and a MIPS R3010A 
floating-point unit. To speed up data transfers, IRIS Indigo uses custom ASICs designed 
by Silicon Graphics. These chips manage memory and processor interrupts, handle I/O and 
control the bus, often without CPU intervention [SILIC91]. 

We had one IRIS Indigo, Black, available for our FDDI research. This system has 
one 33 MHz processor, one 1 GB internal hard disk drive and 32 Mbytes of RAM. The 
workstation has the following features: 


• A single 33 MHz chip with integer, floating point, memory management, and
caches.
• 32-Kbyte instruction cache and 32-Kbyte data cache.
• Integral support for cache-coherent multiprocessing.


a. Software Architecture. 
The IRIS Indigo uses IRIX 4.0 which is Silicon Graphics’ implementation of
the UNIX operating system. IRIX 4.0 is based on AT&T UNIX System V.3, but also
includes numerous 4.3 BSD extensions, such as TCP/IP network protocols and NFS, which
provide transparent access to files across a heterogeneous network.


b. Hardware Architecture. 
This IRIS Indigo CPU board, Figure 12 [SILIC91], contains four functional
sections:

• The processor core, which contains the CPU and FPU.
• Main memory, which contains DRAM and supporting circuitry.
• The I/O system, which contains peripheral ports and hardware designed to read
incoming data and to manage incoming and outgoing data.
• The audio system, which contains audio ports and digital signal processing
hardware.

Figure 12: The IRIS Indigo CPU Board


Three busses connect parts of the CPU board:

• The CPU bus, which connects the CPU, FPU, cache control, and bus control
hardware.
• The GIO32 bus, which is the main system bus connecting the processor core,
main memory, I/O system, expansion slots, and graphics board.
• The Peripheral bus, which connects the peripheral ports, audio system, and
other I/O components.


The CPU bus and the GIO32 bus have separate clocks and run at different
speeds so that each part runs at maximum capability. The CPU and other chips can be
upgraded independently as technology improves.


Instruction and Data Caches. Each cache is a 32 Kbyte cache. The
instruction cache holds frequently used instructions and the data cache holds frequently
used data. The IRIS Indigo uses a write-through scheme in the data cache to ensure that
writes made to the cache are also written to the corresponding page in main memory.


The GIO32 Bus. This bus is the IRIS Indigo’s main system bus, and is
designed for high speed data transfer. It connects the main systems of IRIS Indigo: the
processor core, main memory, the I/O systems, the graphics system, and any systems
plugged into the expansion slots. This bus is a synchronous, multiplexed address/data, burst
mode bus that operates at 33.3 MHz, clocked independently of the CPU. The bus protocol
supports data transfers at a maximum sustained rate of one word per clock.


The I/O System. The I/O system ties together a variety of I/O ports and the
chips that drive them, a system clock, system Programmable Read-Only Memory (PROM)
for booting up, and static RAM.


The HPC1 ASIC. The HPC1 is a custom Silicon Graphics chip that connects
to the GIO32 bus, the peripheral bus, and directly to several of the I/O ports. It is the heart
of the I/O system, and quickly transfers data between main memory and a rich collection
of peripheral devices.

Expansion Slots. The two expansion slots, connected directly to the GIO32
bus, provide direct access to the system for Silicon Graphics and third party plug-in boards
for such applications as high-speed networking, image compression, video deck control,
and additional I/O. Slot 0 is used for our FDDI connection.

IV. TEST DESIGN PLAN


A. TEST STRATEGY 

The objective is to find the upper limit of throughput by measuring actual throughput
between high performance workstations over an FDDI network and to determine what
bottlenecks, if any, exist between Sun Microsystems SPARC 10 multiprocessors running
Solaris 2.3 and NPI’s FDDI network interface cards. This process will include
identifying the various parameters which affect throughput and testing these parameters in
enough detail to determine their impact on network performance. As explained in Chapter
II, there are various levels of software that are involved in transferring data. As shown in
Figure 13, as data is transferred from White to Gold, there are several impacts on the data
transfer rate.

The key to this test design plan will be gathering the appropriate data to determine 
what impact these various parameters have on the transfer rate, and how to measure them. 
Three different methods will be used to measure the performance of data being transferred 
between workstations across the FDDI network. First, a commercial benchmarking tool 
will be used to provide performance results on the workstations. Second, a public domain 
networking benchmark tool will be used to show the transfer rate of the network. Third, a 
simple program which issues an rcp command and measures the time of the file transfer 
will be used. 


B. NEAL NELSON BENCHMARK 

The primary benchmarking tool to be used for providing the performance results on 
the workstations will be the Neal Nelson Business Benchmark™. This benchmark tool has 
been around for over 9 years and has been used as a tool for verifying vendor compliance 
during government contract awards. The Business Benchmark differs from other popular 
benchmarks in that its primary focus is not to provide a single number speed rating for a
system, nor is its primary purpose to emulate a particular user group or duplicate the load
created by a certain task mix. The Business Benchmark was designed to incrementally stress
various parts of a computer system and record how the system performs. The benchmark
was intended to uncover both the strengths and the weaknesses of a computer architecture
and report them separately so that they can be understood and analyzed [GRAY91].


Figure 13: Flow of Data Across the FDDI Network Using the RCP Command


The Neal Nelson Business Benchmark is a multitasking benchmark with a parent/child
design. A parent process creates child processes and instructs them to run tests in various
combinations. There can be from one to one hundred child processes running
simultaneously during a benchmark session. During a test session the parent process creates
a single child process and instructs the child to perform a series of tests. Then the parent
creates a second child and directs both children through the same series of tests. This
process is repeated until a desired maximum number of child processes is reached, or until
the system runs out of some resource such as disk space [NNBM94].
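
The ramp-up described above can be sketched in C. This is only an illustration under stated assumptions, not the Neal Nelson implementation: the names sample_test, run_load_level and ramp_load are hypothetical, the stand-in test is a short arithmetic loop, and the children run independently instead of being directed by the parent through a series of tests.

```c
#include <stddef.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for one benchmark test: a short CPU-bound loop. */
void sample_test(void)
{
    volatile double x = 0.0;
    long i;

    for (i = 0; i < 200000L; i++)
        x += (double)i * 0.5;
}

/*
 * Run `test` in `nchildren` concurrent child processes and return the
 * wall-clock seconds until all of them have exited, or -1.0 on error.
 */
double run_load_level(int nchildren, void (*test)(void))
{
    struct timeval start, done;
    int i, started = 0, failed = 0;

    if (gettimeofday(&start, NULL) != 0)
        return -1.0;

    for (i = 0; i < nchildren; i++) {
        pid_t pid = fork();
        if (pid < 0) {              /* out of processes: stop creating */
            failed = 1;
            break;
        }
        if (pid == 0) {             /* child: run the test and exit */
            test();
            _exit(0);
        }
        started++;
    }

    /* Parent: reap every child that was actually started. */
    for (i = 0; i < started; i++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed = 1;
    }

    if (failed || gettimeofday(&done, NULL) != 0)
        return -1.0;
    return (double)(done.tv_sec - start.tv_sec)
         + (double)(done.tv_usec - start.tv_usec) / 1e6;
}

/*
 * Raise the load from 1 child to max_load children, storing the time for
 * each level in seconds[]. Stops early if a level fails. Returns the
 * number of load levels completed.
 */
int ramp_load(int max_load, void (*test)(void), double *seconds)
{
    int load;

    for (load = 1; load <= max_load; load++) {
        seconds[load - 1] = run_load_level(load, test);
        if (seconds[load - 1] < 0.0)
            break;
    }
    return load - 1;
}
```

Calling ramp_load(20, sample_test, seconds) with a 20-element array would give one timing per load level, which is the shape of data the benchmark reports.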


The benchmark consists of thirty tests, which are divided into three groups. 


Group 1: Tests a mix of activities that are intended to approximate the processing
activities for the following five types of users. Group 1 includes the following tests:



1) Simulated Office Automation Workload 

2) Simulated Database Workload 

3) Simulated Software Development Workload 

4) Simulated Transaction Processing Workload 

5) Simulated Calculation Workload (Math/Statistics/CAD/CAM) 


Group 2: Tests designed to perform various types of calculation tasks and thereby 
profile the performance of the computer’s calculation subsystem. Group 2 includes the 
following tests: 


6) Write to Shared Memory 

7) Read from Memory, Small Instruction Area, Small Data Area 
8) Read from Memory, Small Instruction Area, Larger Data Area 
9) Read from Memory, Larger Instruction Area, Small Data Area 
10) Read from Memory, Larger Instruction Area, Larger Data Area 
11) Make Machine Page or Swap with ‘malloc’ and ‘free’ 

12) Combined Integer and Floating Point Math 

13) Math Library Functions 

14) Semaphores, Shared Memory, Context Switch 

15) Write to and Read from Pipes, Context Switch 

16) Sample System Calls 

17) Increasing Depth of Function Calls 


Group 3: Tests that perform a series of disk input and output functions to profile the 
performance of the disk subsystem. Group 3 includes the following tests: 


18) 1024 byte Sequential Reads from Unix File(s) 
19) 1024 byte Sequential Writes from Unix File(s) 


20) 8192 byte Sequential Reads from Unix File(s)

21) 8192 byte Sequential Writes to Unix File(s)

22) 4096 byte Synchronized Reads from Unix File(s) 

23) 4096 byte Synchronized Reads from Raw Device(s) 
24) 16384 byte Synchronized Reads from Unix File(s) 
25) 16384 byte Synchronized Reads from Raw Device(s) 
26) 4096 byte Pseudo Random Reads from Unix File(s) 
27) 4096 byte Pseudo Random Reads from Raw Device(s) 
28) Profile Disk Cache for Unix File(s) 

29) Profile Disk Cache for Raw Device(s) 

30) 8192 byte Sequential Writes then ‘sync’ 


During each of the above tests, measures will be obtained at load factors from 1 to 20. 
This load factor number indicates the number of copies of the benchmark program which 
were running simultaneously. Each load factor unit might approximate the workload of one 
or two heavy users or possibly twenty light users. The measurements will be in seconds to 
complete the measured task. The system which takes less time to accomplish the measured 
task is the faster system. 


C. NEW TEST TRANSMISSION CONTROL PROTOCOL 


New Test TCP (nttcp) uses Test TCP (ttcp) as the basic tool for determining measured
throughput over any physical network media. nttcp provides the option of dynamically
changing the TCP/IP window size during the throughput test. ttcp was developed by the
U.S. Army’s Ballistic Research Lab (BRL), which is now the U.S. Army Research Lab
(ARL), and is considered one of the de facto network performance benchmarks.

nttcp tests TCP and UDP performance by timing the transmission and reception of 
data between two systems using the UDP or TCP protocols. It differs from common “blast” 
tests, which tend to measure the remote inetd as much as the network performance, and 
which usually do not allow measurements at the remote end of a UDP transmission. 

For testing, the transmitter should be started with -t after the receiver has been started
with -r. For testing various window sizes, nttcp allows a -w option which permits the user
to specify the desired TCP/IP window size. Some of the other options which were used
during this investigation are shown below:


-t    Transmit mode.
-r    Receive mode.
-u    Use UDP instead of TCP.
-n    Number of source buffers transmitted.
-l    Length of buffers in bytes.
-w    TCP/IP window size in k bytes.
-p    Port number to send to or listen on.


Below are the commands used in a typical session during this investigation: 


Receiving system (gold):

gold: nttcp -r -p3000 -w12

Transmitting system (white):

white: nttcp -t -p3000 -l65536 -n1024 -w12 gold

The shell scripts along with the nttcp program are in Appendix A. The shell scripts
doit.sh and ttest.sh were written by personnel at the U.S. Army Research Lab (ARL) and
modified to fit this investigation. These scripts were designed to be used with the program
nttcp. The first script, doit.sh, provides the various combinations of data sizes to be
transferred along with starting and stopping times of each run. This script runs through six
iterations of identical data sets. The shell script ttest.sh provides the calls to the program
nttcp. Using the data length and number of packets specified in the shell script doit.sh,
ttest.sh makes numerous calls to nttcp, varying the window size from 4 k to 60 k in 8 k
increments. This combination of amount of data transferred, number of test runs and
number of window sizes provides a total of 576 measured data transfers during a single run:
amount of data transferred (12 sizes) * number of test runs (6 runs) * number of window
sizes (8 different window sizes) = 576 measured data transfers. Below is an example of the
results from a single call to nttcp with the amount of data to be transferred equal to
33,554,432 bytes of data and the TCP/IP window size being varied from 4 k to 60 k in 8 k
increments:
Window Size (bytes)    Transfer Rate (Mb/s)
       4096                 32.7680
      12288                 29.1271
      20480                 37.4491
      28672                 43.6907
      36864                 52.4288
      45056                 43.6907
      53248                 43.6907
      61440                 37.4491
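
The run count quoted above is simple arithmetic on the sweep bounds, and a short C sketch makes it easy to recompute for other step sizes. The function names are hypothetical and are not taken from doit.sh or ttest.sh.

```c
/*
 * Count the window sizes visited when stepping from first_k to last_k
 * (inclusive) in increments of step_k, all in k bytes.
 * Returns 0 for a non-positive step.
 */
int count_window_sizes(int first_k, int last_k, int step_k)
{
    int k, count = 0;

    if (step_k <= 0)
        return 0;
    for (k = first_k; k <= last_k; k += step_k)
        count++;
    return count;
}

/* Measured transfers in one run: data sizes x repetitions x window sizes. */
long transfers_per_run(int data_sizes, int repetitions, int window_sizes)
{
    return (long)data_sizes * repetitions * window_sizes;
}
```

With the values in the text, count_window_sizes(4, 60, 8) gives 8 window sizes and transfers_per_run(12, 6, 8) gives 576. Stepping by 4 k instead would visit 15 window sizes.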


The TCP/IP window size is adjusted during these runs using the setsockopt system 
call. After the window size has been adjusted, the getsockopt system call is performed to 
verify that the TCP/IP window size has been changed as requested. Figure 14 shows an 
example of the setsockopt and getsockopt system calls used in the nttcp program. 


if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, (char *) &sendwin, sizeof(sendwin)) < 0)
    printf("set send window size didn't work\n");
if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, (char *) &rcvwin, sizeof(rcvwin)) < 0)
    printf("set rcv window size didn't work\n");

if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, (char *) &sendwin, &optlen) < 0)
    printf("get send window size didn't work\n");
else printf("send window size = %d\n", sendwin);

if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, (char *) &rcvwin, &optlen) < 0)
    printf("get rcv window size didn't work\n");
else printf("receive window size = %d\n", rcvwin);





Figure 14: Example of setsockopt and getsockopt System Calls 
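
For readers who want to try the set-then-verify pattern of Figure 14 outside nttcp, the following is a minimal self-contained sketch. The function name set_and_verify_window is hypothetical; only the setsockopt and getsockopt calls and the SO_SNDBUF and SO_RCVBUF options come from the figure.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Request send and receive socket buffer sizes of `bytes` on `fd`, then
 * read back the sizes the kernel actually applied.
 * Returns 0 on success, -1 if any setsockopt or getsockopt call fails.
 */
int set_and_verify_window(int fd, int bytes, int *sendwin, int *rcvwin)
{
    socklen_t optlen;
    int request = bytes;

    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &request, sizeof(request)) < 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &request, sizeof(request)) < 0)
        return -1;

    /* optlen is in/out: it must hold the buffer size before each call. */
    optlen = sizeof(*sendwin);
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, sendwin, &optlen) < 0)
        return -1;
    optlen = sizeof(*rcvwin);
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, rcvwin, &optlen) < 0)
        return -1;
    return 0;
}
```

The values read back need not equal the request. Some systems round or adjust it (Linux, for example, reports double the requested value), which is one reason the verification step is worth keeping.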

D. REMOTE COPY PROTOCOL TRANSFER

Another program being used to measure the data transfer rate is a simple C program
which issues an rcp command transferring a file from one workstation to another (Appendix
B). The primary reason for choosing the rcp command is that it uses TCP, which is a reliable
transfer agent, versus UDP, which is unreliable. By using the rcp command, we are able to
measure the time from the rcp command being issued to the time the ack is received back
from the other workstation. The system can access the clock prior to issuing the rcp
command, and then again after it receives the ack from the other workstation. Since rcp
provides for reliable data transfer, this allows a measurement of the total transfer time.
Figure 15 shows the code obtaining the current system time, issuing the rcp command and
then obtaining the system time again after the transfer is complete.


/* Get start time in sec&usec and check if successful */
a = gettimeofday(&timestart, zonestart);
if (a != 0)
    printf("Oops! %d\n", a);

/* Use system call to do file transfer */
system("rcp large_file gold-fddi:/usr/test/gtow_test");

/* Get stop time in sec&usec and check if successful */
b = gettimeofday(&timedone, zonedone);
if (b != 0)
    printf("Oops! %d\n", b);


Figure 15: Implementation of RCP System Call 
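
Figure 15 stops at the second clock reading. Turning the two readings into a transfer time can be sketched as follows. The helper names elapsed_seconds and time_command are hypothetical and do not come from the program in Appendix B. The sketch passes NULL for the time zone argument, which is an assumption that differs from the figure.

```c
#include <stdlib.h>
#include <sys/time.h>

/* Seconds between two gettimeofday() samples, combining sec and usec. */
double elapsed_seconds(const struct timeval *start, const struct timeval *done)
{
    return (double)(done->tv_sec - start->tv_sec)
         + (double)(done->tv_usec - start->tv_usec) / 1e6;
}

/*
 * Run `command` through system() and store the wall-clock time it took in
 * *seconds. Returns the status from system(), or -1 if the clock could
 * not be read.
 */
int time_command(const char *command, double *seconds)
{
    struct timeval start, done;
    int status;

    if (gettimeofday(&start, NULL) != 0)
        return -1;
    status = system(command);
    if (gettimeofday(&done, NULL) != 0)
        return -1;
    *seconds = elapsed_seconds(&start, &done);
    return status;
}
```

The microsecond field of the stop time can be smaller than that of the start time. Computing the difference in floating point, as above, handles that case without an explicit borrow.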





This method includes all the overhead from the operating system, rcp, TCP, IP and 
FDDI. After the rcp command is issued, the file is located in the file system and loaded into 
memory. Next, the workstation from which the command is being executed must perform 
a name/address resolution to determine where the file is being transferred. DNS provides 
this name/address resolution. Once this name/address resolution is performed the file is 
handed off to TCP to begin the transfer from workstation A to workstation B. TCP hands
the file transfer off to IP which forwards the file to the FDDI protocol. At this point the
FDDI SBus card transfers the file from workstation A to workstation B. At workstation B 
the reverse scenario takes place. The file is handed off from the FDDI protocol to the IP 
protocol, to the TCP protocol, and finally reaches the OS on workstation B. At this point, 
TCP on workstation B must issue an ack to let workstation A know that the file has been 
correctly received and handed off to the OS. 

The rcp command copies files between machines. Each filename or directory
argument is either a remote file name of the form:

hostname:path

or a local file name (containing no : characters, or a / before any : characters).

If a filename is not a full path name, it is interpreted relative to the user’s home
directory on hostname. A path on a remote host may be quoted (using \, ", or ') so that the
metacharacters are interpreted remotely.

rcp does not prompt for passwords; your current local user name must exist on 
hostname and allow remote command execution by rsh. 

rcp handles third party copies, where neither source nor target files are on the current
machine. Hostnames may also take the form

username@hostname:filename

to use username rather than your current local user name as the user name on the
remote host. rcp also supports Internet domain addressing of the remote host, so that:

username@host.domain:filename

specifies the username to be used, the hostname, and the domain in which that host
resides. Filenames that are not full path names will be interpreted relative to the home
directory of the user named username, on the remote host.

E. PARAMETERS WHICH AFFECT BOTH TESTS

The following driver parameters will be tuned under Solaris 2.3.

sbf_num_llc_rx      /* For LLC network traffic: */
                    /* Number of 4k receive buffers, maximum is 64 4k buffers */
                    /* Default is 48 4k buffers per NP-SB adapter */

nfs_async_threads   /* Number of NFS threads for handling network file service */
                    /* Default is 8 */

sbf_treq            /* Amount of time for TTRT, default is 8ms */
                    /* Range is from 2ms to 165ms */

sbf_mtu             /* Maximum protocol packet size, default is 4352 bytes */





The above four tunable parameters along with the TCP/IP window size will be varied
during the rcp and nttcp transfer tests. The TCP/IP window size controls the amount of data
permitted to be transferred between TCP acknowledgments. Numerous tests will be run
varying each of the four parameters to determine what combination of values provides the
optimum throughput performance and what weight each parameter has on the changes. The
baseline test will use the values the manufacturer recommends as the default values.


F. FILE SIZES FOR BOTH TRANSFERS 

In order to measure the impact of the TCP, IP and FDDI overhead during the test, 
various sizes of files will be transferred. For the rcp test, the properties of the four files to 
be used are shown in TABLE 1. These files range in size from 6 bytes to 17,989,936 bytes. 
The amount of overhead during the transfers can be estimated as follows: 

For the nttcp test, the amounts of data to be transferred are shown in TABLE 2. The
amount of data to be transferred is obtained by specifying the length of a buffer to be
transferred and the number of buffers. As an example, if 2048 buffers of length 8192 bytes
are transferred, then a total of 16,777,216 bytes of data are being transferred. The
combinations listed in TABLE 2 give a range from 4,194,304 bytes to 268,435,456 bytes
being transferred.


TABLE 1: RCP FILE SIZES AND ASSOCIATED OVERHEAD 


In order to make it easier to reference which file size has been used in the various tests,
the files will be referred to as File A through File H with File A being the smallest file,
4,194,304 bytes, and File H being the largest file, 268,435,456 bytes. The rest of the files
are in order of size from the smallest file to the largest file.
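
The sizes of File A through File H follow from the number of buffers multiplied by the buffer length. The sketch below recomputes them. It assumes that Files A through D use 8192-byte buffers and Files E through H use 65536-byte buffers, each with 512, 1024, 2048 and 4096 buffers. That assumption is consistent with the 2048 x 8192 example and with the stated smallest and largest sizes, but the function names are hypothetical.

```c
/* Total bytes moved when `nbuf` buffers of `buflen` bytes are sent. */
long long transfer_bytes(long long nbuf, long long buflen)
{
    return nbuf * buflen;
}

/*
 * Fill sizes[0..7] with the totals for File A through File H.
 * ASSUMPTION: Files A-D use 8192-byte buffers, Files E-H use 65536-byte
 * buffers, each with 512, 1024, 2048 and 4096 buffers.
 */
void nttcp_file_sizes(long long sizes[8])
{
    static const long long counts[4] = { 512, 1024, 2048, 4096 };
    int i;

    for (i = 0; i < 4; i++) {
        sizes[i]     = transfer_bytes(counts[i], 8192);   /* Files A-D */
        sizes[i + 4] = transfer_bytes(counts[i], 65536);  /* Files E-H */
    }
}
```

Under these assumptions File A is 512 x 8192 = 4,194,304 bytes and File H is 4096 x 65536 = 268,435,456 bytes. Files D and E have the same total size and differ only in buffer length.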

G. SYSTEM CONFIGURATIONS FOR ALL TESTS


As described in the previous sections, various tunable parameters and file sizes will be
used during this investigation. In order to obtain reliable results, numerous tests must be
conducted to achieve a comfortable confidence level. Unfortunately, it is not practicable to
perform all the test runs necessary to test all possible combinations, let alone run enough
iterations of each test to obtain the desired confidence level in the results.

As an example, just running the various combinations of tests described earlier with
the nttcp program, there were 576 measured data transfers during a single run. One such
test took a combined total of 3 hours and 15 minutes to run. During initial runs of the nttcp
program, the TCP/IP window size was varied in 4 k increments. It was determined that
there was little difference between the individual transfer rates at adjacent 4 k window sizes.
Therefore, follow-on tests were run at intervals of 8 k window sizes. This change reduced
the run times from over 6 hours to just over 3 hours with little to no loss of usable results.

As noted earlier, there are other tunable parameters which can be modified by using
the set command in the /etc/system file. Once again, it is not possible to test all possible
combinations of parameters. As an example, if we start with the 576 measured data
transfers which took over 6 hours with a 4 k TCP/IP window size increment, then test the
TTRT parameter at 5 ms increments (33 tests), then the sbf_num_llc_rx buffers at 4 k
increments (15 tests), then the sbf_num_smt_rx buffers at 4 k increments (15 tests) and
assume that we would like a confidence level which requires 50 runs of each test, we would
have a total of 33*15*15*50 = 371,250 tests needed to reach any conclusions. If each test
took over six hours to conduct, it would take a total of 2,227,500 hours or 92,812.5 days
just to finish conducting the tests.
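
The estimate above can be reproduced with a few lines of C, which also makes it easy to see how the total changes if any one factor is reduced. The function names are hypothetical.

```c
/* Number of test runs for a full-factorial sweep of three parameters. */
long count_tests(long ttrt_settings, long llc_settings,
                 long smt_settings, long runs_each)
{
    return ttrt_settings * llc_settings * smt_settings * runs_each;
}

/* Total duration in days, given the run count and the hours per run. */
double total_days(long tests, double hours_per_test)
{
    return (double)tests * hours_per_test / 24.0;
}
```

With the figures in the text, count_tests(33, 15, 15, 50) is 371,250 and total_days(371250, 6.0) is 92,812.5 days, which is more than 250 years.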

In his book [JAIN91], Raj Jain discusses this dilemma of having too many variables
to consider. The solution is to first get a gross picture of the impact of changing selected
parameters. Once a parameter’s impact on performance has been determined, then more
thorough testing can be conducted by adjusting the correct parameters to obtain the desired
confidence level. An example of this method in practice is changing from 4 k intervals in
the TCP/IP window size to 8 k intervals.

In addition to the tunable parameters already discussed, this investigation is looking
into the impact of the workstations running in multiprocessor modes and using a recently
developed operating system, Solaris 2.3. This now doubles the required testing! First, tests
will be conducted in the two processor configuration. Then, each Sun SPARCstation will
be tested with only a single processor, but still running Solaris. Once again, it is not possible
to test all possible tunable parameters, especially in both hardware configurations. Once a
pattern has been established in the single processor configuration, follow-on tests in the
multi-processor hardware configuration will be focused to limit the scope of tests to
changing those parameters which produce the best results.


H. PARAMETER BASELINE 

First, a baseline condition must be established before any changes are made to the
system. This baseline will use the parameter values shown in TABLE 3. This
table pertains more to the parameter settings in the nttcp and rcp tests than the Neal Nelson
Benchmark test. The first parameter, nfs_async_threads, has an impact on all three tests.
The other three parameters only impact the results of the nttcp and rcp tests. No changes will
be made to the workstations other than the changes to the tunable parameters listed below.
Stored with the results of each nttcp and rcp test run is a README file with the below
parameters and their values for that test.

While the below parameters are changed for the nttcp and rcp tests, the TCP/IP window
size will also be varied. The TCP/IP window size is not listed below in TABLE 3 as a
tunable parameter. It is being treated differently because of the way it is varied during the
test transfers. The nttcp program will be varying the TCP/IP window size during the test,
whereas the below listed tunable parameters must be changed by rebooting the
workstations in-between the various tests.


TABLE 3: DEFAULT PARAMETERS USED FOR ALL THREE TESTS





Below is a review of the parameter descriptions:

sbf_num_llc_rx      /* For LLC network traffic. Number of 4k receive buffers, */
                    /* maximum is 64 4k buffers */
sbf_mtu             /* Maximum protocol packet size, default is 4352 bytes */
t_req               /* Token holding time, default is 8ms */
nfs_async_threads   /* For NFS service. Number of threads allotted. Default is 8 */

The results of the initial nttcp baseline test during the single processor test are shown
below in TABLE 4. The results shown in this table are the averaged results obtained from
running this test for six runs. The first column shows the TCP/IP window size used during
the test. The next 8 columns, which are labeled File A through File H, show the averaged
measured throughput in Mbps achieved during this test run.


TABLE 4: TEST RESULTS IN SINGLE PROCESSOR MODE 


Window Size
(k bytes)    File A   File B   File C   File D   File E   File F   File G   File H
     4         --       --     40.05    36.06    32.46    31.92    31.51    31.96
    12         --       --       --       --       --       --       --       --
    20       32.77    43.69    41.87    40.57    40.57    40.33    40.62    39.86
    28       32.77    49.--    38.23    42.65    40.57    40.89    41.67    41.81
    36       32.77    43.69    38.23    43.69    41.61    40.89    41.67    42.38
    44       32.77    49.15    38.23    42.65    40.57    39.43    42.26    42.09
    52       76.96    49.15    38.23    41.61    38.75    37.93    39.35    36.15
    60       32.77    43.69    38.23    41.61    33.72    34.37    30.09    30.60

(-- marks values that are not legible in the source.)


























V. TEST RESULTS AND ANALYSIS 


In this chapter, the results from the three tests discussed in Chapter IV will be
presented. First, the results from the Neal Nelson Benchmark tests will be presented. These
results will show that the newer, faster 50MHz processors should outperform the older
40MHz processors. Next, the results from the New Test TCP (nttcp) network throughput
tests will be presented. These results will show under what conditions the highest
throughput can be achieved and what throughput bottlenecks exist. Last, the results from
the rcp transfer tests will be presented. These results will help to identify bottlenecks within
the workstation as a whole. The nttcp tests directly access the TCP/IP layer and do not
provide a true measure of all the overhead present in distributed processing.


A. NEAL NELSON BENCHMARK 


The Neal Nelson Benchmark is the tool being used to measure the capabilities of the
workstations and the operating systems being tested. It is important to verify that the
hardware we believe to be faster actually performs faster.

To begin with, two system disks were configured with the Solaris 2.3 operating system 
and one system disk was configured with the SunOS 4.1.3 operating system. A three 
gigabyte disk was partitioned and half of it made into a Unix file system, leaving the other 
half as a raw disk partition. The source code for the benchmark was obtained, installed, and 
compiled under Solaris 2.3 and SunOS 4.1.3 with the default tuning parameters. 

The benchmark was started in the background and took approximately 20 hours to run
under each of the following four hardware configurations: Gold with two 50MHz
processors and White with two 40MHz processors, each running Solaris 2.3; Gold with one
50MHz processor running Solaris 2.3; Gold with one 50MHz processor running SunOS
4.1.3. Solaris 2.3 is Sun Microsystems’ new operating system based on AT&T System V
UNIX, while SunOS 4.1.3 is based on Berkeley’s UNIX.

Once the benchmark testing was completed, the results were collected and
electronically mailed to Neal Nelson & Associates, where the test reports were generated.
The results from the three different configurations discussed below are listed in Appendix
C with approval from Neal Nelson & Associates.


1. Gold Versus White, Two Processors and Solaris 2.3 
In group 1 tests, which are intended to approximate the processing activities of
five types of users, Gold consistently performed the tasks approximately 20 percent faster
than White.


Figure 16: Gold Versus White, Two Processors


In group 2 tests, which are designed to perform various types of calculation tasks
and thereby profile the performance of the computer’s calculation subsystem, Gold
continued to perform the tasks approximately 20 percent faster than White.

In group 3 tests, which performed a series of disk input and output functions to
profile the performance of the disk subsystem, the results were mixed, but Gold still
outperformed White on the average. These results varied from Gold outperforming White
by an average of 20 percent, to times when White outperformed Gold.

In Figure 16 on page 42 are the graphical results of Test 1, Simulated Office
Automation Workload. Gold, with two 50MHz processors running Solaris 2.3, clearly took
less time to perform the test than White with two 40MHz processors running Solaris 2.3
except at a load of 11. Once again, a load can signify either several light users or a single
heavy user. As the loads increase you have either more light users or multiple heavy users.


Figure 17: Gold One Processor Versus Gold Two Processors


2. Gold One Processor Versus Gold Two Processors and Solaris 2.3 


In group 1 tests, the two processor configuration consistently outperformed the
single processor configuration by 80 to 90 percent.

In group 2 tests, the two processor configuration continued to outperform the
single processor configuration by 80 to 90 percent in all areas but one. In test 14,
Semaphores, Shared Memory and Context Switch, the two processor configuration only
outperformed the single processor configuration by 5 to 7 percent.

In group 3 tests, the results were once again mixed. The two processor
configuration outperformed the single processor configuration in all tests but three by 50
percent. In test 19, 1024 byte Sequential Writes from Unix File(s) and test 21, 8192 byte
Sequential Writes to Unix File(s), the single processor outperformed the two processor
configuration by an average of over 200 percent. In test 30, 8192 byte Sequential Writes
then ‘sync’, the single processor configuration outperformed the two processor
configuration by approximately 20 percent.

In Figure 17 on page 43 are the graphical results of Test 1, Simulated Office
Automation Workload. Gold with one 50MHz processor running Solaris 2.3 clearly took
more time to perform the test than Gold with two 50MHz processors running Solaris 2.3.


3. Gold With One Processor, Solaris 2.3 Versus SunOS 4.1.3 


In group 1 tests, the results were once again varied. SunOS 4.1.3 outperformed
Solaris 2.3 in 4 of the 5 tests at the higher load levels by 3 to 4 percent. Solaris 2.3
outperformed SunOS 4.1.3 in two of the tests at the lighter load levels by 3 to 4 percent.

In group 2 tests, the results were more consistently in favor of SunOS 4.1.3. In 7
of the 12 tests, SunOS 4.1.3 outperformed Solaris 2.3 by 4 to 5 percent. In test 13, Math
Library Functions, SunOS 4.1.3 outperformed Solaris 2.3 by an average of 40 percent.
Solaris 2.3 only outperformed SunOS 4.1.3 in three of the test areas. In two of these areas
the margin was, once again, only 2 to 3 percent. In test 17, Increasing Depth of Function
Calls, Solaris 2.3 outperformed SunOS 4.1.3 by an average of 40 to 50 percent.

In group 3 tests, the results were once again varied. In 6 of the tests, SunOS 4.1.3
outperformed Solaris 2.3 by anywhere from 15 to over 500 percent. In seven of the tests,
Solaris 2.3 outperformed SunOS by anywhere from 100 to over 400 percent. Once again
though, it appears that SunOS 4.1.3 came out slightly ahead in the high load area over
Solaris 2.3.

Below in Figure 18 are the graphical results of Test 1, Simulated Office
Automation Workload. Gold with one 50MHz processor running SunOS 4.1.3 slightly beat
out Gold with one 50MHz processor running Solaris 2.3 at the higher loads.


Figure 18: Gold, One Processor, SunOS 4.1.3 Versus Solaris 2.3


B. NEW TEST TRANSMISSION CONTROL PROTOCOL 

As discussed in Chapter IV, the file sizes used during the test runs with New Test TCP
(nttcp) are shown below in TABLE 5. The files are created by specifying the length of the
buffer to be created and the number of buffers to be sent. The files will be referred to as File
A through File H with File A being the smallest file, 4,194,304 bytes, and File H being the
largest file, 268,435,456 bytes. The rest of the files are in order of size from the smallest
file to the largest file.


TABLE 5: FILES (DATA SIZES) FOR NTTCP TEST 


                      Length of Buffers:
Number of Buffers     8192 bytes (Files A - D)     65536 bytes (Files E - H)
      512             FILE A     4194304 bytes     FILE E    33554432 bytes
     1024             FILE B     8388608 bytes     FILE F    67108864 bytes
     2048             FILE C    16777216 bytes     FILE G   134217728 bytes
     4096             FILE D    33554432 bytes     FILE H   268435456 bytes











After conducting several test runs and observing the results, it became obvious that 
some smaller file sizes were not large enough to obtain accurate results. Whenever data is 
transferred using the nttcp program, the actual CPU time is the time used for calculating the 
throughput. If the CPU time used is too small, less than 0.1 seconds, the results become 
unreliable. An example of an unreliable transfer rate is given below in Figure 19. The 
reason for the inaccurate throughput result is the small amount of CPU time taken during 
this data transfer. 

Transfers using the number of buffers = 512 and the length of buffer = 8192 were the 
only ones which had the unreliable transfer rates. There were typically only one or two 
transfer rates in each test which were unreliable. However, the window size was not always 
the same at which the unreliable transfer rate occurred. Therefore, the results of File A 
transfers were not used in this analysis. 


send window size = 12288
receive window size = 12288
ttcp-r: 4194304 bytes in 0.06 real seconds = 68266.67 KB/sec = 546.1333 Mb/s





Figure 19: NTTCP Output for File Size of 4194304 Bytes 
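
The figures printed in Figure 19 can be reproduced from the byte count and the elapsed time, which shows why a very short timing interval inflates the result. The sketch below is inferred from the numbers in the figure and is not taken from the nttcp source: it assumes KB/sec is bytes per second divided by 1024 and that Mb/s is KB/sec multiplied by 8 and divided by 1000. The function names are hypothetical, and the 0.1 second cutoff is the one stated in the text.

```c
/* KB/sec as printed in Figure 19: bytes per second divided by 1024. */
double kbytes_per_second(double bytes, double seconds)
{
    return bytes / seconds / 1024.0;
}

/* Mb/s as printed in Figure 19: KB/sec * 8 / 1000 (inferred convention). */
double reported_mbps(double bytes, double seconds)
{
    return kbytes_per_second(bytes, seconds) * 8.0 / 1000.0;
}

/* Timing intervals shorter than 0.1 seconds are treated as unreliable. */
int timing_is_reliable(double seconds)
{
    return seconds >= 0.1;
}
```

For File A, reported_mbps(4194304, 0.06) gives about 546.13 Mb/s, well above the 100 Mbps physical rate of FDDI, so the value cannot be genuine. The same formula applied to 33,554,432 bytes in 8 seconds gives 32.768 Mb/s.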














1. Single Processor Results 


The first 32 tests were run while Gold and White were set up in a single-processor
configuration running Solaris 2.3. These 32 tests represent a small subset of all possible
tunable parameter combinations. The primary focus of this first set of tests was to determine
the effect of modifying the TCP/IP window size, the nfs_async_threads and the t_req
parameters. Additionally, tests were conducted transferring data from White to Gold, Gold
to White and both ways simultaneously. The 32 tests and the values of the tunable
parameters are listed in TABLE 36, Appendix D.

The data gathered in the above 32 tests was analyzed using multiple linear
regression analysis according to the model
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_m x_m + \epsilon$, which relates the
behavior of a dependent variable $y$ to a linear function of the set of independent variables
$x_1, x_2, \ldots, x_m$. The $\beta$s are the parameters that specify the nature of the
relationship, and $\epsilon$ is the random error term. The dependent variable $y$ in this model
is throughput. Refer to Figure 20 on page 49 under the bold face number 12 for the list of
$\beta$s used in this model.


The tool used to produce the multiple linear regression analysis is the Statistical
Analysis System (SAS), which assists data analysts in analyzing data using regression
analysis. Figure 20 is an analysis of data throughput between White and Gold in the
single processor configuration using the results from tests 1 - 32. A description of the
SAS output, as explained in [SASI91], follows. The bold face numbers have been added
to aid the description.

1. The name of the dependent variable is THRUPUT. 

2. The degrees of freedom (DF) associated with the sums of squares (SS). 

3. The Regression SS (called Model SS) is 61279.61308, and the Residual SS 
(called ERROR SS) is 65217.01718. The sum of these two sums of squares is the C TOTAL 
(corrected total) SS = 126496.63026. This illustrates the basic identity in regression 
analysis that TOTAL SS = MODEL SS + ERROR SS. Usually, a good model results in the 
MODEL SS being a large fraction of the C TOTAL SS.

4. The corresponding Mean Squares are the Sums of Squares divided by the
respective DF. The MS for ERROR (MSE) is an unbiased estimate of $\sigma^2$, provided the
model is correctly specified.

5. The value of the F statistic, 239.470, is the ratio of the MODEL Mean Square
to the ERROR Mean Square. It is used to test the hypothesis that all coefficients
in the model, except the intercept, are 0. In this case, this hypothesis is:

$H_0: \beta_1 = \beta_2 = \dots = \beta_7 = 0$

6. The p value (Prob>F) of 0.0001 indicates that at least one of the $\beta_i$ is not
equal to 0.

7. Root MSE = 6.04621 is the square root of the ERROR MS and estimates the
error standard deviation.


8. Dep Mean = 30.21891 is simply the average of the values of the variable 
THRUPUT over all observations in the data set. 


9. C.V. = 20.00803 is the coefficient of variation expressed as a percentage. This 
measure of relative variation is the ratio of Root MSE to Dep Mean, multiplied by 100. 


10. R-SQUARE = 0.4844 shows that just under half of the variation in
THRUPUT can be explained by variation in the independent variables in the model.

11. ADJ R-SQ is an alternative to R-SQUARE that is adjusted for the number of
parameters in the model according to the formula

$\text{ADJ R-SQ} = 1 - (1 - \text{R-SQUARE})\,\dfrac{n - 1}{n - m - 1}$

where $n$ is the number of observations in the data set and $m$ is the number of
regression parameters in the model, excluding the intercept. This adjustment is used to
overcome an objection to R-SQUARE as a measure of goodness of fit of the model. This
objection stems from the fact that R-SQUARE can be driven to 1 simply by adding
superfluous variables to the model with no real improvement in fit. This is not the case with
ADJ R-SQ, which tends to stabilize to a certain value when an adequate set of variables is
included in the model.





Model: SINGLE PROCESSOR MODEL
(1) Dependent Variable: THRUPUT

Analysis of Variance

              (2)        (3)            (4)         (5)       (6)
                       Sum of          Mean
Source         DF     Squares         Square     F Value   Prob>F

Model           7    61279.61308    8754.23044   239.470   0.0001
Error        1784    65217.01718      36.55662
C Total      1791   126496.63026

(7) Root MSE    6.04621    (10) R-square   0.4844
(8) Dep Mean   30.21891    (11) Adj R-sq   0.4824
(9) C.V.       20.00803

Parameter Estimates

(12)           (13)           (14)          (15)         (16)
            Parameter       Standard     T for H0:
Variable     Estimate          Error    Parameter=0   Prob>|T|

INTERCEP    27.673306     0.68625789       40.325      0.0001
SINGLE       8.620893     0.28565645       30.179      0.0001
WHITRAN      5.140603     0.28565645       17.996      0.0001
NUMBUFF     -0.000246     0.00010718       -2.295      0.0219
LENBUFF     -0.000107     0.00000511      -20.927      0.0001
WINDSIZE     0.008507     0.00779192        1.092      0.2751
TTRT         0.016060     0.01864409        0.861      0.3891
THREADS      0.008069     0.03570706        0.226      0.8212

Figure 20: SAS Analysis of Single Processor Transfers
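The summary statistics described in items 3 through 11 can be recomputed from the sums of squares and degrees of freedom alone. The sketch below is only a cross-check of those definitions using the values quoted for the single processor model; the function name and dictionary keys are hypothetical and this is not SAS code.

```python
import math


def anova_summary(model_ss, error_ss, model_df, error_df, dep_mean):
    """Recompute the regression summary statistics from the ANOVA table entries."""
    total_ss = model_ss + error_ss            # TOTAL SS = MODEL SS + ERROR SS
    model_ms = model_ss / model_df            # Mean Square = SS / DF
    mse = error_ss / error_df
    root_mse = math.sqrt(mse)
    r_square = model_ss / total_ss
    n = model_df + error_df + 1               # number of observations
    m = model_df                              # regressors, excluding the intercept
    adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - m - 1)
    return {
        "total_ss": total_ss,
        "f_value": model_ms / mse,
        "root_mse": root_mse,
        "cv": 100.0 * root_mse / dep_mean,
        "r_square": r_square,
        "adj_r_square": adj_r_square,
    }


stats = anova_summary(61279.61308, 65217.01718, 7, 1784, 30.21891)
for name, value in stats.items():
    print(f"{name:>13}: {value:.5f}")
```

Running the same function on the sums of squares of the other two models is a quick way to check a figure whose digits are hard to read.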


12. The labels INTERCEP, SINGLE, WHITRAN, NUMBUFF, LENBUFF,
WINDSIZE, TTRT and THREADS identify the coefficient estimates. The parameter
SINGLE shows whether only one workstation was transmitting at a time or whether
both White and Gold were transmitting at the same time. The parameter WHITRAN
shows whether White or Gold is transmitting. The other parameters were
previously described in Chapter IV, Test Design Plan.

13. The Parameter Estimates give the fitted model

THRUPUT = 27.673306 + 8.620893(SINGLE) + 5.140603(WHITRAN)
          - 0.000246(NUMBUFF) - 0.000107(LENBUFF)
          + 0.008507(WINDSIZE) + 0.016060(TTRT) + 0.008069(THREADS)

Thus, for example, a window size of 1k contributes 0.008507 to the throughput
if all other parameters are held fixed. If the window size is 45k, then it contributes
45 x 0.008507 = 0.382815 if all other parameters are held fixed.

14. These are the (estimated) standard errors of the parameter estimates and are
useful for constructing confidence intervals for the parameters.

15. The t tests (T for H0: Parameter = 0) are used for testing hypotheses about
individual parameters. The complete model for all of these t tests contains all the variables
on the right side of the MODEL statement. The reduced model for a particular test contains
all these variables except the one being tested. Thus, the t statistic for WINDSIZE, 1.092,
which tests the hypothesis $H_0: \beta_{\text{WINDSIZE}} = 0$, is actually testing whether the
complete model containing SINGLE, WHITRAN, NUMBUFF, LENBUFF, WINDSIZE, TTRT and
THREADS fits better than the reduced model containing only SINGLE, WHITRAN, NUMBUFF,
LENBUFF, TTRT and THREADS.

16. The p value (Prob > |T|) for this test is p = 0.2751.
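Items 13 through 16 can be illustrated with a short calculation. The sketch below evaluates the fitted single processor model and recomputes a t statistic and an approximate two-sided p value from the estimates and standard errors in Figure 20. The function names are hypothetical, and the p value uses a normal approximation to the t distribution (an assumption that is reasonable here only because the error degrees of freedom, 1784, are large), so it will differ slightly from the SAS figure.

```python
import math

# Parameter estimates and standard errors as read from Figure 20.
ESTIMATES = {
    "INTERCEP": 27.673306, "SINGLE": 8.620893, "WHITRAN": 5.140603,
    "NUMBUFF": -0.000246, "LENBUFF": -0.000107, "WINDSIZE": 0.008507,
    "TTRT": 0.016060, "THREADS": 0.008069,
}
STD_ERRORS = {
    "INTERCEP": 0.68625789, "SINGLE": 0.28565645, "WHITRAN": 0.28565645,
    "NUMBUFF": 0.00010718, "LENBUFF": 0.00000511, "WINDSIZE": 0.00779192,
    "TTRT": 0.01864409, "THREADS": 0.03570706,
}


def contribution(name, value):
    """Contribution of one variable to THRUPUT with all others held fixed."""
    return ESTIMATES[name] * value


def predict(settings):
    """Evaluate the fitted model for a dict of variable settings."""
    return ESTIMATES["INTERCEP"] + sum(
        contribution(name, value) for name, value in settings.items()
    )


def t_statistic(name):
    """T for H0: Parameter = 0 is the estimate divided by its standard error."""
    return ESTIMATES[name] / STD_ERRORS[name]


def two_sided_p_normal(t):
    """Two-sided p value under a standard normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2.0))


t_windsize = t_statistic("WINDSIZE")
p_windsize = two_sided_p_normal(t_windsize)
print(f"WINDSIZE at 45k contributes {contribution('WINDSIZE', 45):.6f}")
print(f"t = {t_windsize:.3f}, approximate p = {p_windsize:.4f}")
```

The large p value for WINDSIZE is what item 16 reports: within this model, the estimate is not distinguishable from zero.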

As shown in Figure 20 under item 16, Prob>|T|, the parameters NUMBUFF,
WINDSIZE, TTRT and THREADS had the least impact on THRUPUT in this model: the
higher the Prob>|T| of an independent variable, the less impact it has on the dependent
variable being modeled. Also included in this model were the system transferring
the data (WHITRAN) and whether it was a one-way or two-way transfer (SINGLE).
The tunable parameters are therefore competing with two larger effects: a 40MHz
workstation is being compared to a 50MHz workstation, and another station may or may
not be competing for the token to transfer data.

The end result in this model is that the independent variable SINGLE has the most
impact on THRUPUT and WHITRAN has the next largest impact on THRUPUT. This
shows that competition for the token has more impact on throughput than tuning the
system. However, there is still a performance gain to be realized by tuning the system for
better throughput. Figure 21 is a graphic comparison of the 1st test with the 29th test.
As a reminder, the 1st test uses the default parameters and the 29th test uses the
following parameter settings: t_req = 25ms, nfs_async_threads = 16, sbf_num_llc_rx = 48.

Figure 21: Single Processor, File D Transfer From White to Gold


2. Two Processor Results 


The second set of tests was run while Gold and White were set up in a two-processor
configuration running Solaris 2.3. These 48 tests represent a small subset of all
possible tunable parameter combinations. The primary focus of this set of tests was to
determine the effect of modifying the TCP/IP window size, nfs_async_threads, t_req,
sbf_num_llc_rx and sbf_mtu parameters. The 48 tests and the values of the tunable
parameters are listed in TABLE 71, Appendix E.

The primary difference between this set of tests and the single processor tests is
that all transfers were made from White to Gold. Including transfers from Gold
to White in this set would have doubled the number of tests to 96.
Originally it was thought that increasing the number of parameters being observed
would also increase the R-square value. The intention was to account for more of
the factors which impact the dependent variable THRUPUT.


Model: TWO PROCESSOR MODEL
Dependent Variable: THRUPUT

Analysis of Variance

                      Sum of           Mean
Source       DF      Squares         Square    F Value   Prob>F

Model         7    66901.88212    9557.41173    68.151   0.0001
Error      2680   375842.31356     140.23967
C Total    2687   442744.19568

Root MSE   11.84228    R-square   0.1511
Dep Mean   40.72729    Adj R-sq   0.1489
C.V.       29.07702

Parameter Estimates

                   Parameter        Standard     T for H0:
Variable   DF       Estimate           Error    Parameter=0

INTERCEP    1     -91.980251     12.35679655       -7.444
NUMBUFF     1     -0.000068737    0.00017141       -0.401
LENBUFF     1     -0.000062619    0.00000817       -7.664
WINDSIZE    1     -0.019754       0.01246095       -1.585
TTRT        1     -0.024980       0.02981591       -0.838
THREADS     1     -0.034226       0.05710325       -0.599
LLC         1      0.643378       0.03496846       18.399
MTU         1      0.024786       0.00285516        8.681

Figure 22: SAS Analysis of Two Processor Transfers

As shown in Figure 22 on page 52, the R-square value decreased considerably
between the single processor tests and the dual processor tests. As will be shown later,
the cause of this decrease was the removal of the largest impact on throughput, competing
with other stations for the token. Another indicator of the lack of confidence in the data
being modeled is the large Standard Error for the independent variable INTERCEP. In the
single processor model INTERCEP had a standard error of 0.68625789. In the dual processor
model, the error has increased to 12.35679655.


The independent variables NUMBUFF, THREADS and TTRT continued to have
the least impact on the dependent variable THRUPUT, as indicated by their small t
statistics and correspondingly high Prob>|T| values. The independent variables with the
largest impact were LENBUFF, LLC and MTU.


3. One And Two Processor Results 


In the final analysis of both one and two processor tests, some additional facts
need to be presented. There were a total of 4,480 throughputs measured in this analysis:
1,792 measurements in the one processor configuration and 2,688 measurements
in the two processor configuration. These are averaged measurements taken from the six
runs in each of the 32 + 48 = 80 tests. Of the total, 896 measurements were taken while
both Gold and White were transmitting at the same time and 3,584 while only one station
was transmitting.


When the model was first run including all the data from the one and two
processor tests, the R-square value was only 0.3559. This was higher than in the two
processor model but lower than in the one processor model. Scatter plots of the
individual parameters against throughput were made to determine where there might be
problems. The most obvious problem was the large variation of throughput with window
size: at both the high end and the low end, the plot of window size versus throughput
was not linear. By restricting the analysis to window sizes less than 50k and greater
than 16k, the R-square value increased to 0.6600. This reduced the number of measured
observations from 4,480 to 2,240.
Model: ONE & TWO PROCESSOR MODEL
Dependent Variable: THRUPUT

Analysis of Variance

                      Sum of           Mean
Source       DF      Squares         Square    F Value   Prob>F

Model        10   179959.58511   17995.95851   432.681   0.0001
Error      2229    92708.03657      41.59176
C Total    2239   272667.62168

Root MSE    6.44917    R-square   0.6600
Dep Mean   42.53933    Adj R-sq   0.6585
C.V.       15.16048

Parameter Estimates

                   Parameter        Standard
Variable   DF       Estimate           Error

INTERCEP    1     -70.427345      9.87019489
SINGLE      1       9.928996      0.43090313
WHITRAN     1       3.652165      0.43090313
NUMBUFF     1      -0.000052070   0.00010226
LENBUFF     1      -0.000047372   0.00000487
WINDSIZE    1      -0.200113      0.01523473
TTRT        1      -0.012831      0.01778717
THREADS     1      -0.039099      0.03406588
LLC         1       0.583336      0.02693145
MTU         1       0.015782      0.00219894
SD          1       9.535964      0.44849820

Figure 23: SAS Analysis of Single and Two Processor Transfers



The results of the one and two processor analysis are above in Figure 23. One new
independent variable, SD, is used to model whether the transfer comes from the one
processor tests or the two processor tests. Just as before, the independent variables
NUMBUFF, TTRT, and THREADS have the least impact on THRUPUT. With
the removal of the window sizes noted above, WINDSIZE now carries more weight in this
model. The largest impact on THRUPUT, in order of impact, is caused by the variables
SINGLE, SD, LLC and WINDSIZE. This statement will be covered in more detail later.
This indicates once again that processor power has a major impact on throughput. A
graphical model of the difference is below in Figure 24. This figure plots throughput
from identical parameter configurations, one from a two processor run
and the other from a one processor run.

Figure 24: White Single Processor vs. White Two Processors


Another useful result which can be determined from the analysis of the one and
two processor tests is a predicted throughput. Figure 25 shows SAS predictions of
THRUPUT based on the 2,240 measured throughputs used in this analysis. To check the
minimum predicted throughput, a test was run using the parameter settings
indicated in Figure 25. Data was transferred from Gold to White and White to Gold
simultaneously. The results were taken from Gold with NUMBUFF = 4096, LENBUFF =
65536, WINDSIZE = 44, TTRT = 25, THREADS = 16, LLC = 40 and MTU = 4192. The
results are below in TABLE 6.

The SAS prediction for the minimum throughput was 15.5302 Mbps. As shown in
TABLE 6, the actual tests produced an average of 15.1463 Mbps and a median of
15.0454 Mbps. Since the data used in the model was averaged data rather than median
data, the average achieved rate is the more appropriate rate to compare. The SAS
prediction for the maximum throughput was 58.7810 Mbps. As shown in TABLE 6, the
actual tests produced an average of 60.07 Mbps and a median of 65.5360 Mbps. In both
cases the average measured throughput was within about 2.5 percent of the predicted
throughput, which suggests that the SAS model predicts well at these two extremes.

Figure 25: SAS Throughput Prediction


TABLE 6: RESULTS OF SAS PREDICTIONS (Mbps)

|      | Run 1   | Run 2   | Run 3   | Run 4   | Run 5   | Run 6   | Average | Median  |
| LOW  | 17.4763 | 16.3840 | 12.8660 | 13.7009 | 20.1699 | 10.2802 | 15.1463 | 15.0454 |
| HIGH | 32.7680 | 65.5360 | 65.5360 | 65.5360 | 65.5360 | 65.5360 | 60.07   | 65.5360 |








The following formula relates the behavior of the dependent variable THRUPUT
to a linear function of the set of independent variables SINGLE, WHITRAN, NUMBUFF,
LENBUFF, WINDSIZE, TTRT, THREADS, LLC, MTU and SD. The coefficients are the values
calculated in the One and Two Processor Model, Figure 23 on page 54.


THRUPUT = -70.427345 + 9.928996(SINGLE) + 3.652165(WHITRAN) 
- 0.000052070(NUMBUFF) - 0.000047372(LENBUFF) 
- 0.200113 (WINDSIZE) - 0.012831(TTRT) - 0.039099(THREADS) 
+ 0.583336(LLC) + 0.015782(MTU) + 9.535964(SD) 


When the minimum and maximum throughput were predicted above in Figure 25
on page 56, it was simply a matter of inserting values into the above formula. For the
maximum predicted throughput, the largest parameter value is used if the parameter
estimate is positive and the smallest parameter value if the parameter estimate is
negative. For the minimum predicted throughput, the largest parameter value is used if
the parameter estimate is negative and the smallest parameter value if the parameter
estimate is positive.

Below are the formulas for maximum and minimum throughput with the
parameter estimates and parameter values multiplied together.

Maximum Throughput:

58.7544 = -70.427345 + 9.928996 + 3.652165 - 0.05331968 - 0.38807142 - 4.00226
          - 0.064155 - 0.312792 + 32.666816 + 68.683264 + 19.071928

Minimum Throughput:

15.5302 = -70.427345 + 0 + 0 - 0.21327872 - 3.1045714 - 8.804972 - 0.320775
          - 0.625584 + 23.33344 + 66.158144 + 9.535964
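The sign rule used to pick the parameter values can be sketched as follows. The coefficients are those of the one and two processor formula; the value ranges in RANGES are not listed in this section and were inferred by dividing each term in the two sums by its coefficient, so they should be read as assumptions. The names are hypothetical. Because the printed coefficients are rounded, the results agree with the quoted 58.7544 and 15.5302 only to about three decimal places.

```python
INTERCEPT = -70.427345

# Coefficients from the one and two processor model.
COEFFS = {
    "SINGLE": 9.928996, "WHITRAN": 3.652165, "NUMBUFF": -0.000052070,
    "LENBUFF": -0.000047372, "WINDSIZE": -0.200113, "TTRT": -0.012831,
    "THREADS": -0.039099, "LLC": 0.583336, "MTU": 0.015782, "SD": 9.535964,
}

# (smallest, largest) value of each variable; inferred from the printed terms.
RANGES = {
    "SINGLE": (0, 1), "WHITRAN": (0, 1), "NUMBUFF": (1024, 4096),
    "LENBUFF": (8192, 65536), "WINDSIZE": (20, 44), "TTRT": (5, 25),
    "THREADS": (8, 16), "LLC": (40, 56), "MTU": (4192, 4352), "SD": (1, 2),
}


def extreme_prediction(maximize):
    """Apply the sign rule: choose the end of each range that pushes the
    prediction up (maximize=True) or down (maximize=False)."""
    total = INTERCEPT
    for name, coef in COEFFS.items():
        low, high = RANGES[name]
        # A positive coefficient raises throughput at the high end of the range,
        # a negative coefficient raises it at the low end.
        value = high if (coef > 0) == maximize else low
        total += coef * value
    return total


max_thruput = extreme_prediction(maximize=True)
min_thruput = extreme_prediction(maximize=False)
print(f"maximum = {max_thruput:.4f} Mbps, minimum = {min_thruput:.4f} Mbps")
```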


Once the minimum and maximum throughputs were computed, the relative value
of each parameter was calculated by taking the difference between its contribution to
the maximum throughput and its contribution to the minimum throughput. Below in
Figure 26 are the results from this calculation. The value from
the maximum calculation is listed, then the value from the minimum calculation and
finally the difference. It is this difference which shows the impact each parameter
has on the end throughput. The higher the difference is, the more weight that parameter
carries in determining the maximum throughput.


             WHITRAN  NUMBUFF  LENBUFF  WINDSIZE   TTRT   THREADS    LLC     MTU      SD
Maximum        3.65    -0.05    -0.38    -4.00    -0.06    -0.31    32.66   68.68   19.07
Minimum        0       -0.21    -3.10    -8.80    -0.32    -0.62    23.33   66.15    9.53
Difference     3.65     0.16     2.72     4.80     0.26     0.31     9.33    2.53    9.54

Figure 26: Relative Importance of Each nttcp Parameter


The results listed above show that the following parameters, in order of
importance, have the most impact on throughput using the current model:

• Whether the data was being transferred from only one workstation to another or
both workstations were transferring data to each other simultaneously.

• Whether the workstation had one or two processors.

• The number of 4K receive buffers allotted for receiving data.

• The TCP/IP window size available for sending data.
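The ordering in this list follows directly from the terms of the maximum and minimum throughput sums. As a minimal sketch (the dictionary and variable names are hypothetical), the parameters can be ranked by the size of the difference between their two contributions:

```python
# Terms of the maximum and minimum throughput sums, in the order
# SINGLE, WHITRAN, NUMBUFF, LENBUFF, WINDSIZE, TTRT, THREADS, LLC, MTU, SD.
MAX_TERMS = {
    "SINGLE": 9.928996, "WHITRAN": 3.652165, "NUMBUFF": -0.05331968,
    "LENBUFF": -0.38807142, "WINDSIZE": -4.00226, "TTRT": -0.064155,
    "THREADS": -0.312792, "LLC": 32.666816, "MTU": 68.683264, "SD": 19.071928,
}
MIN_TERMS = {
    "SINGLE": 0.0, "WHITRAN": 0.0, "NUMBUFF": -0.21327872,
    "LENBUFF": -3.1045714, "WINDSIZE": -8.804972, "TTRT": -0.320775,
    "THREADS": -0.625584, "LLC": 23.33344, "MTU": 66.158144, "SD": 9.535964,
}

# The spread between the two contributions is the weight each parameter carries.
spread = {name: MAX_TERMS[name] - MIN_TERMS[name] for name in MAX_TERMS}
ranking = sorted(spread, key=spread.get, reverse=True)

for name in ranking:
    print(f"{name:<9} {spread[name]:8.4f}")
```

Note that MTU has the largest single term in both sums but one of the smaller spreads, because the range of MTU values tested was narrow.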


Since the TCP/IP window size was limited in the above model to a range of 20k to 44k,
this parameter appears to have less impact than it really has. As an example, in
TABLE 72 on page 120 of Appendix E, the throughput rate for File C is 32.77 Mbps for a
window size of 4k and 58.25 Mbps for a window size of 44k. That means the throughput
rate at a 4k window size is only 56 percent of the rate at the 44k window size. In this
case, the window size has the largest impact on throughput performance. Unfortunately,
the results at the lower and higher window sizes were not consistent in all cases, so that
data was removed from the analysis. In most cases, though, the difference in throughput
performance between a TCP/IP window size of 4k and a window size of greater than 20k
is more significant than any other factor considered in this investigation.


Based on visual inspection of the results from both the one processor tests and
the two processor tests, below is a revised list, in order of importance, of the parameters
having the most impact on throughput:

• The TCP/IP window size available for sending data.

• Whether the data was being transferred from only one workstation to another or
both workstations were transferring data to each other simultaneously.

• Whether the workstation had one or two processors.

• The number of 4K receive buffers allotted for receiving data.


Another parameter which showed unexpected results is the WHITRAN
parameter. This parameter is used to track any differences in throughput between
transmitting data from White to Gold and from Gold to White. The result in Figure 25 on
page 56 indicates that transmitting data from White to Gold was faster than transmitting
data from Gold to White. In the first 32 one processor tests, White had one 40MHz
processor and Gold had one 50MHz processor. In the second 48 tests, White had two
40MHz processors and Gold had two 50MHz processors. Based on the Neal Nelson
Benchmark tests, Gold should be capable of transferring data faster than White.


Several additional tests were conducted to determine why White was able to
transmit data at a higher throughput than Gold. First, the FDDI cards were swapped to see
if the FDDI card in Gold was causing the problem. The results of these tests are in TABLE
69 on page 117 and TABLE 70 on page 118. There was no noticeable difference in
throughput rates with the boards swapped. Next, the two 50MHz processors were placed in
White and the two 40MHz processors were placed in Gold. The results of these tests are in
TABLE 121 and TABLE 122 on page 137. As shown in Figure 27, even when both
transmitting systems had two 50MHz processors and both receiving systems had 40MHz
processors, White still had a higher throughput rate with File C than Gold.

Figure 27: Throughput Comparison Between White and Gold


The only other difference between White and Gold is that Gold is the server on 
the FDDI network. Since the FDDI network only had three workstations on the network, 
this additional load on Gold should not be that great. 


C. REMOTE COPY PROTOCOL TRANSFERS 


Initially, the plan was to conduct file transfers using the rcp system call while varying
the tunable parameters just as in the nttcp tests. However, it was quickly observed that there
were no noticeable differences in measured throughput at the different parameter
settings. This was understandable for the parameters nfs_async_threads and t_req, since the
SAS model showed that these tunable parameters had little effect on throughput. However,
it was expected that there would be some difference in throughput rates when the TCP/IP
window size, llc and mtu parameters were varied.














The reason these parameters did not have an impact is that rcp does small
read()'s and write()'s, so the syscall overhead dominates over the time spent in the
kernel in TCP. If an application wants optimum bulk data throughput, it should increase the
receive buffering, and also do moderately large read()'s and write()'s so that the syscall
overhead does not dominate. Also, rcp has to go through a complete login, exec of the
user's shell, and run through the user's ".cshrc" or ".profile" on the server side before it
begins transferring any data. If the data transfer is not really huge, the time spent logging
in will be much greater than the time spent transferring the data.
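The effect of read and write size on the number of system calls can be sketched with a simple copy loop. This is an illustration only and is not taken from the rcp source: the function name is hypothetical, and the 512-byte and 64-Kbyte chunk sizes are assumptions chosen to show the contrast, since the sizes rcp actually uses are not given here. In-memory streams stand in for the file and the socket.

```python
import io


def copy_stream(src, dst, chunk_size):
    """Copy src to dst in chunk_size pieces and return the number of
    read/write pairs issued. Each pair stands for two system calls."""
    calls = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:          # empty bytes object means end of stream
            break
        dst.write(chunk)
        calls += 1
    return calls


# A payload the same size as the LARGE test file (1,314,923 bytes).
payload = bytes(1_314_923)

small_calls = copy_stream(io.BytesIO(payload), io.BytesIO(), 512)
large_calls = copy_stream(io.BytesIO(payload), io.BytesIO(), 64 * 1024)
print(f"512-byte chunks: {small_calls} read/write pairs")
print(f"64-Kbyte chunks: {large_calls} read/write pairs")
```

With the larger chunk size the fixed per-call cost is paid roughly a hundred times less often for the same amount of data, which is the point of the recommendation above.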


Knowing that the largest impacts on throughput based on the SAS modeled data are TCP/IP
window size, processor power and whether or not another station is also transmitting,
four different transfer tests were conducted with each of the four file sizes. As shown below
in TABLE 7 and TABLE 8 on page 62, tests were conducted in the one processor
configuration and the two processor configuration while transferring files one-way and
two-way (between White and Gold simultaneously).


TABLE 7: RCP ONE PROCESSOR TRANSFER RESULTS

[Table values not recoverable from the source. Columns: TINY (6 bytes), MEDIUM
(48,072 bytes), LARGE (1,314,923 bytes), HUGE (17,989,936 bytes). Rows: one-way
transfers and two-way transfers (White to Gold and Gold to White), each to a file and
to /dev/null. The only surviving value is 16.72 Mbps in the two-way /dev/null row.]



















Also, files were transferred from disk to disk and from disk to /dev/null. This second
transfer method does not result in a disk write at the destination workstation. The device
driver /dev/null is used to dispose of data without needing to delete it: data sent
to /dev/null is accepted by the driver without being written to disk.

The largest impact seen in this set of tests was the file size. The lowest throughput rate
was observed when transferring the smallest file, TINY. This file has an associated
overhead of 90.9% when being transferred over FDDI. The highest throughput was seen
with the file HUGE. This file only had an overhead of 1.37% when transferred over FDDI.
These overhead figures include the overhead associated with the FDDI, IP and TCP
protocols. Another area with results similar to the nttcp tests is whether the transfers are
one-way or two-way: when the two workstations have to compete for the token, the
throughput drops.
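The two overhead percentages can be reproduced with a simple per-frame calculation. The sketch below rests on assumptions that are not stated in this section: about 60 bytes of combined FDDI, IP and TCP headers per frame, 40 of which (IP and TCP) count against a 4,352-byte MTU. These values were chosen because they reproduce the quoted figures, and the names are hypothetical.

```python
# Assumed values, chosen to illustrate the calculation (not stated in the text).
HEADER_BYTES_PER_FRAME = 60   # combined FDDI + IP + TCP headers
IP_TCP_HEADER_BYTES = 40      # portion of the headers carried inside the MTU
MTU_BYTES = 4352


def overhead_fraction(file_bytes):
    """Fraction of all transmitted bytes that is protocol header."""
    max_payload = MTU_BYTES - IP_TCP_HEADER_BYTES
    frames = -(-file_bytes // max_payload)      # ceiling division
    header_total = frames * HEADER_BYTES_PER_FRAME
    return header_total / (file_bytes + header_total)


tiny = overhead_fraction(6)            # TINY file
huge = overhead_fraction(17_989_936)   # HUGE file
print(f"TINY: {tiny * 100:.1f}% overhead, HUGE: {huge * 100:.2f}% overhead")
```

Under these assumptions a 6-byte file travels in a single frame that is almost entirely header, while a large file fills most frames to the MTU and the header cost becomes small.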


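TABLE 8: RCP TWO PROCESSOR TRANSFER RESULTS

| Transfer                | Destination | TINY        | MEDIUM         | LARGE             | HUGE               |
|                         |             | (6 bytes)   | (48,072 bytes) | (1,314,923 bytes) | (17,989,936 bytes) |
| One-way, White to Gold  | /filename   | (illegible) | (illegible)    | 4.94 Mbps         | 13.54 Mbps         |
|                         | /dev/null   | (illegible) | (illegible)    | (illegible)       | (illegible)        |
| One-way, Gold to White  | /filename   | (illegible) | (illegible)    | 5.26 Mbps         | (illegible)        |
|                         | /dev/null   | (illegible) | (illegible)    | (illegible)       | (illegible)        |
| Two-way, White to Gold  | /filename   | (illegible) | (illegible)    | (illegible)       | (illegible)        |
| and Gold to White       | /dev/null   | (illegible) | (illegible)    | 3.55 Mbps         | 23.18 Mbps         |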


















The results during the rcp tests were much lower than during the nttcp tests. As an
example, on the transfer of a file of over 17 Mbytes from Gold with two processors to
White:/dev/null, the best achieved throughput rate was 29.82 Mbps with rcp. This is only
29.82 percent of FDDI's available bandwidth and only about 46 percent of the highest
achieved throughput using nttcp (65 Mbps). When transferring the same file from Gold to
White and writing the file to disk, the transfer rate was 21.66 Mbps. This rate is only 72
percent of the rate achieved when transferring the data to /dev/null. Figure 28 on page 63
is a graphical plot of the transfer rates just mentioned while transferring the 17.9 Mbyte
file from Gold with two 50MHz processors to White with two 40MHz processors.

There were two main differences between the transfer methods. First, the rcp transfers
add another layer of protocol to the transfers. The rcp protocol hands off the data to be
transferred to the TCP/IP protocol layers. This of course increases the amount of overhead
transferred. Second, using rcp to transfer the data involves reading the data from disk
before it can be transferred. Even though large amounts of data can be cached in the
SuperCache 1-Mbyte external cache, this is not large enough for extremely large files to
be completely cached. During this test, files were transferred 9 times and the median
throughput rate was used for the results.

Figure 28: RCP File Transfers From Gold To White





The results from the rcp tests were largely as expected. The two processor
transfers were faster than the single processor transfers and the one-way transfers were
faster than the two-way transfers. However, the difference in these throughput rates was not
as large as that seen with the nttcp tests. Since the additional overhead from the rcp system
call should affect the transfer rates evenly, the only other difference is that the data
was transferred from disk instead of being generated by the CPU. The large difference in
throughput rates achieved between the two test methods would indicate that disk access
is a very large bottleneck in throughput performance.


A quick comparison of an nttcp transfer of 16,777,216 bytes (File C) with an rcp
transfer of a 17,989,936-byte file shows a throughput rate of 32.77 Mbps for the nttcp
transfer and a throughput rate of 28.42 Mbps for the rcp transfer
to /dev/null. Both of these tests were one-way tests from White to Gold
with both systems in the two processor configuration. In this comparison, the rcp test had
a throughput rate which is 86.7 percent of the nttcp throughput rate. This seems to indicate
that the retrieval of the file from disk and the overhead of the rcp protocol are responsible
for 13.7 percent of the slowdown in throughput when transferring files.


When comparing the transfer rate of an rcp transfer from White to a file location on
Gold with the nttcp throughput rate, there is a much larger difference in throughput. The
nttcp throughput rate is still 32.77 Mbps and the throughput rate for the rcp file to file
transfer is 13.54 Mbps. Here the rcp throughput rate is 41.3 percent of the nttcp throughput
rate. This means that the time to receive and process the file at the destination workstation
accounts for 45 percent of the reduced throughput. This is the 58.7 percent reduction minus
the 13.7 percent attributed to the retrieval of the file from disk and the overhead of the rcp
protocol.


D. ANALYSIS SUMMARY 


The results from the Neal Nelson Benchmark showed that the systems being
investigated were functioning as expected. The 50MHz system outperformed the 40MHz
system and the two processor system outperformed the one processor system. One
unexpected result was that SunOS 4.1.3 slightly outperformed Solaris 2.3 in just about
every test except disk access to UNIX files. Solaris 2.3 was the clear winner in this area.

The nttcp results were analyzed using a multiple linear regression analysis model.
Even though the throughput results were not linear, the model is believed to be accurate
enough to show the relationship between the parameters being investigated. The analysis
of this data provides the most concrete results of the two throughput test methods.


The number of workstations transmitting on an FDDI network has the largest impact
on throughput among the parameters investigated, according to the one processor and two
processor models. An example of this impact is to take the SAS prediction shown in Figure
25 on page 56 and change the parameter SINGLE from its one-way value to the two-way
value. This allows SAS to predict a new throughput rate based on all the previous values
except the change just noted. The new throughput prediction is 48.8254 Mbps. This is
only 83.1 percent of the original throughput prediction of 58.7544 Mbps.

The power of the workstation itself is a major factor in throughput potential. This is
seen in the fact that the second largest impact on throughput in the one processor and two
processor model is whether or not the workstation had two processors. The new one
processor prediction is a throughput of 49.2184 Mbps. This is
83.7 percent of the original throughput prediction of 58.7544 Mbps.
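Both what-if predictions follow from the linear form of the model: changing one variable by one unit shifts the prediction by that variable's coefficient. The sketch below assumes, based on the terms of the maximum throughput formula, that SINGLE is coded 1 for one-way and 0 for two-way transfers and that SD is coded 2 for two processors and 1 for one processor. The variable names are hypothetical.

```python
MAX_PREDICTION = 58.7544   # original maximum throughput prediction (Mbps)
SINGLE_COEF = 9.928996     # coefficient of SINGLE in the combined model
SD_COEF = 9.535964         # coefficient of SD in the combined model

# SINGLE goes from 1 (one-way) to 0 (two-way): subtract one coefficient.
two_way_prediction = MAX_PREDICTION - SINGLE_COEF * (1 - 0)

# SD goes from 2 (two processors) to 1 (one processor): subtract one coefficient.
one_processor_prediction = MAX_PREDICTION - SD_COEF * (2 - 1)

print(f"two-way:       {two_way_prediction:.4f} Mbps "
      f"({100 * two_way_prediction / MAX_PREDICTION:.1f}% of original)")
print(f"one processor: {one_processor_prediction:.4f} Mbps "
      f"({100 * one_processor_prediction / MAX_PREDICTION:.1f}% of original)")
```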


Since the TCP/IP window size was limited in the model to a range of 20k to 44k, this 
parameter showed up having less of an impact than it really has. In most cases though, the 
difference in throughput performance between a TCP/IP window size of 4k and a window 
size of greater than 20k is more significant than any other factor considered in this 
investigation. 

The results from the rcp tests are more of an observation of the effects of the disk drive
on throughput performance. Since both tests measure the time from the start of the test to
receiving the ack from TCP on the receiving workstation that the data has been received,
the only other real differences are the rcp protocol and the fact that the data is being
transferred as files instead of being generated by the processor.

As pointed out earlier, the overhead of the rcp protocol and the time spent retrieving
the file from disk is approximately 13.7 percent of the throughput rate observed during the
nttcp throughput tests. Additionally, the overhead of processing the file at the receiving
workstation is approximately 45 percent of the throughput rate observed during the nttcp
throughput tests.

The observation made in the nttcp tests that White with only 40MHz processors could
transfer data faster than Gold with 50MHz processors was not seen again in the rcp tests.
In the rcp tests, Gold was able to transfer data at a higher throughput rate than White when
Gold had the two 50MHz processors and White had the two 40MHz processors.














VI. CONCLUSIONS AND TOPICS FOR FUTURE RESEARCH 


A. CONCLUSION 

The objective of this research was to measure actual throughput between high
performance workstations over an FDDI network to determine what bottlenecks, if any,
exist between Sun Microsystems SPARC 10 multiprocessors running Solaris 2.3 and
Network Peripheral Inc.'s (NPI) FDDI network interface cards, and to evaluate
Transmission Control Protocol/Internet Protocol (TCP/IP) as a high speed transport
protocol.

At the beginning of this investigation there were many speculations as to what 
throughput rates could be achieved and what effect varying the different tunable parameters 
would have on the throughput rates. It was assumed that the workstation with the 50MHz 
processors would have a faster throughput rate than the workstation with the 40MHz 
processors. It was also assumed that, since Sun Microsystems was encouraging its users 
to switch from SunOS to Solaris, Solaris 2.3 would clearly outperform SunOS 4.1.3. 


The following sections outline the conclusions drawn from these investigations: 


1. Workstation Conclusions 
There were four benchmark tests conducted using the Neal Nelson Business 
Benchmark run on the two workstations, Gold and White. 

• Gold had two 50MHz processors installed and was running Solaris 2.3. 
• Gold had one 50MHz processor installed and was running Solaris 2.3. 
• Gold had one 50MHz processor installed and was running SunOS 4.1.3. 
• White had two 40MHz processors installed and was running Solaris 2.3. 


Three test comparisons were conducted by Neal Nelson and Associates and the 
results can be summarized as follows: 

• A workstation running Solaris 2.3 with two 50MHz processors can be expected 
to outperform a workstation running Solaris 2.3 with two 40MHz processors 
in most areas of performance by approximately 20 percent. 

• A workstation running Solaris 2.3 with two 50MHz processors can be expected 
to outperform a workstation running Solaris 2.3 with one 50MHz processor in 
most areas of performance by approximately 90 percent. 

• A workstation running SunOS 4.1.3 with one 50MHz processor can be 
expected to outperform a workstation running Solaris 2.3 with one 50MHz 
processor in most areas of performance by approximately 2 percent. 


Of the three comparisons noted above, the first two results were expected. 
However, it was assumed that Sun Microsystems’ release of Solaris 2.3 would result in 
improved operating system performance, not a slight drop in performance. These results 
were very important in the next step of the investigation. Knowing that the workstation with 
two 50MHz processors should outperform the workstation with two 40MHz processors 
helped isolate some unexpected results in workstation throughput. 


2. Throughput Conclusions 


There were two methods used in this investigation to measure throughput. First, 
a public domain network throughput measurement tool, New Test TCP (nttcp), was used 
in order to minimize the workstation overhead. Next, the Remote Copy Protocol (rcp) 
system call was used in order to include all the overhead of daily distributed processing. 
The results obtained from these two test methods were consistent with each other. 

New Test TCP (nttcp): During the nttcp tests the following tunable parameters 
were varied to determine their impact on throughput performance: 

• TCP/IP window size, the amount of data that can be in transit at any one time 
between workstations. 

• sbf_num_llc_rx, number of receive buffers (4k each) on the FDDI board 
allotted for receiving data. 

• nfs_async_threads, number of asynchronous threads allotted for handling 
network file system service. 

• sbf_treq, amount of time allotted for each workstation to transfer data prior to 
passing on the token. This is the TTRT. 

• sbf_mtu, maximum protocol packet size. 

Additionally, the nttcp tests were run on both single processor configurations and 
on two processor configurations. During this investigation the nttcp test results showed 
that the four most significant impacts on throughput, in order of impact, were as 
follows: 

• Whether data was being transferred one-way or both workstations were 
transferring data simultaneously. 

• Whether the workstation had one or two processors. 

• The number of 4K receive buffers allotted for receiving data. 

• The size of the TCP/IP window available for sending data. 


One note about the TCP/IP window size: during this investigation, TCP/IP 
window sizes less than 20k and greater than 44k had too large a deviation in their 
throughput results to be included in the final analysis. When all of the TCP/IP window 
sizes are included, this parameter ends up having the largest impact on throughput rates. 
The rest of the results retain the above order of impact on throughput. 

The other tunable parameters varied during these tests had little impact on 
throughput performance. Below are the rest of the factors affecting throughput in their 
order of importance: 


• The length of the buffers being transmitted. This equates to the size of the data 
being transmitted. 

• The Maximum Transmission Unit (MTU) size. This is the size of the FDDI 
frames of data being transmitted. 

• The number of NFS asynchronous threads allowed for servicing network file 
service. 

• The number of buffers (file size) being transmitted. 


Remote Copy Protocol (rcp): During the rcp tests the tunable parameters were 
varied, but there was no noticeable difference in these throughput rates. The TCP/IP 
window size, which had the largest impact in the nttcp tests, did not have any noticeable 
impact on throughput. The reason the TCP/IP window did not have an impact was that 
rcp does small read()’s and write()’s, so the syscall overhead dominates over the time 
spent in the kernel in TCP. If an application wants optimum bulk data throughput, it should 
increase the receive buffering, and also do moderately large read()’s and write()’s so that 
the syscall overhead does not dominate. 
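
The recommendation above can be sketched in C as follows. This is a minimal illustration, not code from the rcp or nttcp sources: the helper names `set_socket_buffers` and `write_all` are hypothetical, and the buffer size a caller passes is an assumption to be tuned. Only the standard `setsockopt()` options `SO_SNDBUF` and `SO_RCVBUF` and the `write()` call are used.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Request larger kernel send and receive buffers on a socket.
 * Returns 0 on success, -1 if either setsockopt() call fails.
 * The kernel may round or cap the requested size.
 */
int set_socket_buffers(int fd, int bytes)
{
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, (char *)&bytes, sizeof(bytes)) < 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, (char *)&bytes, sizeof(bytes)) < 0)
        return -1;
    return 0;
}

/*
 * Hand the whole buffer to write() at once, retrying on short writes
 * and on EINTR, so that a large transfer needs few system calls.
 * Returns the number of bytes written, or -1 on error.
 */
ssize_t write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

A sender would call `set_socket_buffers()` once after creating the socket and then pass buffers of several kilobytes to `write_all()`, rather than issuing many small writes.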


The only difference between the nttcp tests and the rcp tests was the additional 
overhead with the rcp disk transfers and the rcp protocol overhead. Therefore, the 
conclusion can be drawn that one of these two differences accounted for the very large drop 
in throughput between the nttcp tests and the rcp tests. 


On the transfer of a file of over 17 Mbytes from White with two processors 
to Gold, the best achieved throughput rate was 13.54 Mbps with rcp when the transferred 
data is written to disk. This is only 13.54 percent of FDDI’s available bandwidth and only 
41.3 percent of the highest achieved throughput using nttcp at the same TCP/IP window 
size of 8k. Most of the difference between rcp and nttcp can be attributed to 
the rcp protocol overhead. rcp has to go through a complete login, exec of the user’s shell, 
and run through the user’s “.cshrc” or “.profile” on the server side before it begins 
transferring any data. If the data transfer is not really huge, the time spent logging in will 
be much greater than the time spent transferring the data. 
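
The effect of a fixed startup cost on observed throughput can be illustrated with a small calculation. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the setup time and wire rate passed to `effective_mbps()` are inputs the reader must supply, since the thesis does not report the login time separately.

```c
/* Share of a link's raw bit rate that a measured throughput represents. */
double percent_of_link(double measured_mbps, double link_mbps)
{
    return 100.0 * measured_mbps / link_mbps;
}

/*
 * Effective throughput (Mbps) of one transfer when a fixed startup cost
 * (login, shell startup) precedes the data.  `setup_seconds` and
 * `wire_mbps` are hypothetical inputs, not values measured in the thesis.
 */
double effective_mbps(double file_bytes, double setup_seconds, double wire_mbps)
{
    double bits = file_bytes * 8.0;
    double transfer_seconds = bits / (wire_mbps * 1.0e6);

    return bits / (setup_seconds + transfer_seconds) / 1.0e6;
}
```

For example, `percent_of_link(13.54, 100.0)` gives the 13.54 percent figure quoted above, and `effective_mbps()` shows that the smaller the file, the more the fixed setup time lowers the observed rate.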


B. TOPICS FOR FUTURE RESEARCH 


Several topics for further study can be derived from this investigation. All of them are 
related to either improving throughput or to explaining events which were not explained in 
this thesis. 


Since the nttcp tests were only able to obtain a maximum throughput using TCP 
transfers of 65 Mbps, 35 percent of the available bandwidth of FDDI is not being used. 
What portion of this unused bandwidth is due to lack of processor power and what portion 
is due to inefficiencies in the TCP/IP protocol? 


This investigation primarily looked at throughput rates associated with TCP transfers, 
not User Datagram Protocol (UDP) transfers. UDP datagrams have a header of 8 bytes and 
TCP segments have a header of at least 20 bytes. Also, UDP is not a reliable transport protocol. 
How much throughput can be achieved using UDP and what problems occur when 
using an unreliable transfer protocol? 
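
The header sizes quoted above give a first-order idea of how much of each packet is user data. The sketch below is only an illustration: the function and macro names are hypothetical, and the calculation deliberately ignores the IP header and FDDI framing, which apply equally to both protocols.

```c
#define UDP_HEADER_BYTES 8   /* fixed UDP header */
#define TCP_HEADER_BYTES 20  /* TCP header without options */

/*
 * Fraction of a transport-layer packet that is user payload.
 * Returns 0.0 for a non-positive payload or a negative header size.
 */
double payload_fraction(int payload_bytes, int header_bytes)
{
    if (payload_bytes <= 0 || header_bytes < 0)
        return 0.0;
    return (double)payload_bytes / (double)(payload_bytes + header_bytes);
}
```

For a 1024-byte payload this gives about 99.2 percent for UDP and about 98.1 percent for TCP, so the header difference alone accounts for only a small part of any throughput gap between the two protocols.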

File transfers using the rcp system call displayed a throughput rate of only 13.54 Mbps 
when the transferred data is written to disk. What percentage of this bottleneck is caused 
by the throughput rate on the SCSI-2 controller and what percentage is caused by other 
overhead associated with file transfers? 










































APPENDIX A: NTTCP PROGRAM and TEST SCRIPTS 

DOIT.SH Script 

#!/bin/sh 

date > start 
date > run1_start_time 

ttest.sh 65536 512 
ttest.sh 8192 512 
ttest.sh 65536 1024 
ttest.sh 8192 1024 
ttest.sh 65536 2048 
ttest.sh 8192 2048 
ttest.sh 65536 4096 
ttest.sh 8192 4096 

date > run1_finish_time 

mkdir run1 
mv *.log *.out run1/. 
mv *time run1/. 

date > run2_start_time 

ttest.sh 65536 512 
ttest.sh 8192 512 
ttest.sh 65536 1024 
ttest.sh 8192 1024 
ttest.sh 65536 2048 
ttest.sh 8192 2048 
ttest.sh 65536 4096 
ttest.sh 8192 4096 

date > run2_finish_time 

mkdir run2 
mv *.log *.out run2/. 
mv *time run2/. 

date > run3_start_time 

ttest.sh 65536 512 
ttest.sh 8192 512 
ttest.sh 65536 1024 
ttest.sh 8192 1024 
ttest.sh 65536 2048 
ttest.sh 8192 2048 
ttest.sh 65536 4096 
ttest.sh 8192 4096 

date > run3_finish_time 

mkdir run3 
mv *.log *.out run3/. 
mv *time run3/. 

date > run4_start_time 

ttest.sh 65536 512 
ttest.sh 8192 512 
ttest.sh 65536 1024 
ttest.sh 8192 1024 
ttest.sh 65536 2048 
ttest.sh 8192 2048 
ttest.sh 65536 4096 
ttest.sh 8192 4096 

date > run4_finish_time 

mkdir run4 
mv *.log *.out run4/. 
mv *time run4/. 

date > run5_start_time 

ttest.sh 65536 512 
ttest.sh 8192 512 
ttest.sh 65536 1024 
ttest.sh 8192 1024 
ttest.sh 65536 2048 
ttest.sh 8192 2048 
ttest.sh 65536 4096 
ttest.sh 8192 4096 

date > run5_finish_time 

mkdir run5 
mv *.log *.out run5/. 
mv *time run5/. 

date > run6_start_time 

ttest.sh 65536 512 
ttest.sh 8192 512 
ttest.sh 65536 1024 
ttest.sh 8192 1024 
ttest.sh 65536 2048 
ttest.sh 8192 2048 
ttest.sh 65536 4096 
ttest.sh 8192 4096 

date > run6_finish_time 

mkdir run6 
mv *.log *.out run6/. 
mv *time run6/. 

date > finish 


TTEST.SH Script 

#!/bin/sh 
# 
# Use nttcp to test network throughput. 
# Usage: ttest.sh byte_per_write number_of_writes 
# 

DATALEN=$1 
NPKTS=$2 

# White to Gold 
RECHOST=131.120.1.2 
RSH=/usr/ucb/rsh 
NTTCP=nttcp 

rm -f ttest.out 
rm -f ttest.tran.log 
rm -f ttest.recv.log 

# from 4KB to 60KB windows in steps of 8KB 
SIZE=4 
while test $SIZE -lt 61 
do 
    # start the receiver on the remote host, then transmit to it 
    $RSH $RECHOST $NTTCP -r -w$SIZE > tmp1 2>&1 & 
    sleep 5 
    $NTTCP -t -l$DATALEN -n$NPKTS -w$SIZE $RECHOST >> ttest.tran.log 2>&1 
    sleep 5 
    # record the window size in bytes and the receiver's Mb/s figure 
    grep 'Mb/s' tmp1 | awk '{print '$SIZE'*1024,$12}' >> ttest.out 
    cat tmp1 >> ttest.recv.log 
    SIZE=`expr $SIZE + 8` 
done 

rm -f tmp1 
mv ttest.out ttest.$DATALEN.$NPKTS.out 
mv ttest.tran.log ttest.$DATALEN.$NPKTS.tran.log 
mv ttest.recv.log ttest.$DATALEN.$NPKTS.recv.log 











NTTCP Program 

/* 
 *      N T T C P . C 
 * 
 * Test TCP connection. Makes a connection on port 2000 
 * and transfers zero buffers or data copied from stdin. 
 * 
 * Usable on 4.2, 4.3, and 4.1a systems by defining one of 
 * BSD42 BSD43 (BSD41a) 
 * 
 * Modified for operation under 4.2BSD, 18 Dec 84 
 *      T.C. Slattery, USNA 
 * Minor improvements, Mike Muuss and Terry Slattery, 16-Oct-85. 
 * 
 * Modified on 5 Apr 94 for operation under Solaris 2.3 based on changes 
 * for the TTCP.C program provided by Don Merritt of ARL. 
 *      CPT Mark Schiviey, USA 
 */ 
#ifndef lint 
static char RCSid[] = "@(#)$Header: /src/opy/bei/sbin/ttcp/RCS/ttcp.c,v 1.2 1993/11/30 20:15:39 root Exp $ (BRL)"; 
#endif 
#define BSD43 
/* #define BSD42 */ 
/* #define BSD41a */ 
#include <stdio.h> 
#include <ctype.h> 
#include <errno.h> 
#include <sys/types.h> 
#include <sys/socket.h> 
#include <netinet/in.h> 
#include <netdb.h> 
#include <sys/time.h>           /* struct timeval */ 
#ifdef SYSV 
#include <sys/times.h> 
#include <sys/param.h> 
#else 
#include <sys/resource.h> 
#endif 
#ifdef SYSV 
#define bcopy(s,d,l) memcpy(d, s, (size_t) l) 
#define bzero(s,l) memset(s, 0, (size_t) l) 
#endif 
struct sockaddr_in sinme; 
struct sockaddr_in sinhim; 
struct sockaddr_in sindum; 
struct sockaddr_in frominet; 
int domain, fromlen; 
int fd;                         /* fd of network socket */ 
int sendwin = 32 * 1024; 
int rcvwin = 32 * 1024; 
int optlen = sizeof(int); 
int buflen = 1024;              /* length of buffer */ 
char *buf;                      /* ptr to dynamic buffer */ 
int nbuf = 1024;                /* number of buffers to send in sinkmode */ 
int udp = 0;                    /* 0 = tcp, !0 = udp */ 
int options = 0;                /* socket options */ 
int one = 1;                    /* for 4.3 BSD style setsockopt() */ 
short port = 2001;              /* TCP port number */ 
char *host;                     /* ptr to name of host */ 
int trans;                      /* 0=receive, !0=transmit mode */ 
int sinkmode = 1;               /* 0=normal I/O, !0=sink/source mode */ 
int verbose = 0; 
int nodelay = 0;                /* set TCP_NODELAY socket option */ 
int window = 0;                 /* 0=use default 1=set to specified size */ 
struct hostent *addr; 
extern int errno; 

char Usage[] = "\
Usage: ttcp -t [-options] host <in\n\
    -l##    length of bufs written to network (default 1024)\n\
    -s      don't source a pattern to network, use stdin\n\
    -n##    number of bufs written to network (-s only, default 1024)\n\
    -p##    port number to send to (default 2000)\n\
    -u      use UDP instead of TCP\n\
Usage: ttcp -r [-options] >out\n\
    -l##    length of network read buf (default 1024)\n\
    -s      sink (discard) all data from network\n\
    -p##    port number to listen at (default 2000)\n\
    -B      Only output full blocks, as specified in -l## (for TAR)\n\
    -u      use UDP instead of TCP\n\
"; 

char stats[128]; 
double t;                       /* transmission time */ 
long nbytes;                    /* bytes on net */ 
int b_flag = 0;                 /* use mread() */ 
void prep_timer(); 
double read_timer(); 
double cput, realt;             /* user, real time (seconds) */ 

main(argc,argv) 
int argc; 
char **argv; 
{ 
    unsigned long addr_tmp; 

    if (argc < 2) goto usage; 

    argv++; argc--; 
    while( argc>0 && argv[0][0] == '-' ) { 
        switch (argv[0][1]) { 
        case 'B': 
            b_flag = 1; 
            break; 
        case 't': 
            trans = 1; 
            break; 
        case 'r': 
            trans = 0; 
            break; 
        case 'd': 
            options = SO_DEBUG; 
            break; 
        case 'n': 
            nbuf = atoi(&argv[0][2]); 
            break; 
        case 'l': 
            buflen = atoi(&argv[0][2]); 
            break; 
        case 'w': 
            window = 1; 
            sendwin = 1024 * atoi(&argv[0][2]); 
            rcvwin = 1024 * atoi(&argv[0][2]); 
            break; 
        case 's': 
            sinkmode = 1;       /* source or sink, really */ 
            break; 
        case 'p': 
            port = atoi(&argv[0][2]); 
            break; 
        case 'u': 
            udp = 1; 
            break; 
        default: 
            goto usage; 
        } 
        argv++; argc--; 
    } 
    if(trans) { 
        /* xmitr */ 
        if (argc != 1) goto usage; 
        bzero((char *)&sinhim, sizeof(sinhim)); 
        host = argv[0]; 
        if (atoi(host) > 0) { 
            /* Numeric */ 
            sinhim.sin_family = AF_INET; 
#ifdef cray 
            addr_tmp = inet_addr(host); 
            sinhim.sin_addr = addr_tmp; 
#else 
            sinhim.sin_addr.s_addr = inet_addr(host); 
#endif 
        } else { 
            if ((addr=gethostbyname(host)) == NULL) 
                err("bad hostname"); 
            sinhim.sin_family = addr->h_addrtype; 
            bcopy(addr->h_addr,(char*)&addr_tmp, addr->h_length); 
#ifdef cray 
            sinhim.sin_addr = addr_tmp; 
#else 
            sinhim.sin_addr.s_addr = addr_tmp; 
#endif /* cray */ 
        } 
        sinhim.sin_port = htons(port); 
        sinme.sin_port = 0;     /* free choice */ 
    } else { 
        /* rcvr */ 
        sinme.sin_port = htons(port); 
    } 
    if( (buf = (char *)malloc(buflen)) == (char *)NULL) 
        err("malloc"); 
    fprintf(stderr,"ttcp%s: nbuf=%d, buflen=%d, port=%d\n", 
        trans?"-t":"-r", 
        nbuf, buflen, port); 
    if ((fd = socket(AF_INET, udp?SOCK_DGRAM:SOCK_STREAM, 0)) < 0) 
        err("socket"); 
    mes("socket"); 
    /* Try the getsockopt & setsockopt for Solaris here */ 
#ifndef SOLARIS 
    if (bind(fd, &sinme, sizeof(sinme)) < 0) 
        err("bind"); 
#else 
    /* 
     * Under Solaris, calling connect() on a stream socket binds the 
     * socket to an address. If a bind() is done before the connect(), 
     * an error "connect: Address family not supported by protocol family" 
     * results. Only call bind() for the cases where you're not going 
     * to call connect(). 
     */ 
    if (udp || (!udp && !trans) ) 
        if (bind(fd, (struct sockaddr *) &sinme, sizeof(sinme)) < 0) 
            err("bind"); 
#endif /* SOLARIS */ 
    if (!udp) { 
        if (trans) { 
            /* We are the client if transmitting */ 
            if(options) { 
#ifdef BSD42 
                if( setsockopt(fd, SOL_SOCKET, options, 0, 0) < 0) 
#else /* BSD43 */ 
#ifndef SOLARIS 
                if( setsockopt(fd, SOL_SOCKET, options, &one, sizeof(one)) < 0) 
#else 
                if( setsockopt(fd, SOL_SOCKET, options, (char *) &one, sizeof(one)) < 0) 
#endif /* SOLARIS */ 
#endif 
                    err("setsockopt"); 
            } 
#ifndef SOLARIS 
            if(connect(fd, &sinhim, sizeof(sinhim) ) < 0) { 
#else 
            if(connect(fd, (struct sockaddr *) &sinhim, sizeof(sinhim) ) < 0) { 
#endif /* SOLARIS */ 
                err("connect"); 
            } 
            mes("connect"); 
            if (window) { 
                /* set the requested windows, then read back what the kernel granted */ 
                if (setsockopt (fd, SOL_SOCKET, SO_SNDBUF, (char *) &sendwin, 
                    sizeof(sendwin)) < 0 ) 
                    printf("set send window size didn't work\n"); 
                if (setsockopt (fd, SOL_SOCKET, SO_RCVBUF, (char *) &rcvwin, 
                    sizeof(rcvwin)) < 0 ) 
                    printf("set rcv window size didn't work\n"); 
                if (getsockopt (fd, SOL_SOCKET, SO_SNDBUF, (char *) &sendwin, &optlen) < 0 ) 
                    printf("get send window size didn't work\n"); 
                else printf("send window size = %d\n", sendwin); 
                if (getsockopt (fd, SOL_SOCKET, SO_RCVBUF, (char *) &rcvwin, &optlen) < 0 ) 
                    printf("get rcv window size didn't work\n"); 
                else printf("receive window size = %d\n", rcvwin); 
            } 
        } else { 
            /* otherwise, we are the server and 
             * should listen for the connections 
             */ 
#ifndef SOLARIS 
            listen(fd,0);       /* allow a queue of 0 */ 
#else 
            /* 
             * Under Solaris, specifying a queue length of 0 
             * results in a "connection refused". 
             */ 
            listen(fd,1); 
#endif /* SOLARIS */ 
            if(options) { 
#ifdef BSD42 
                if( setsockopt(fd, SOL_SOCKET, options, 0, 0) < 0) 
#else /* BSD43 */ 
#ifndef SOLARIS 
                if( setsockopt(fd, SOL_SOCKET, options, &one, sizeof(one)) < 0) 
#else 
                if( setsockopt(fd, SOL_SOCKET, options, (char *) &one, sizeof(one)) < 0) 
#endif /* SOLARIS */ 
#endif 
                    err("setsockopt"); 
            } 
            fromlen = sizeof(frominet); 
            domain = AF_INET; 
#ifndef SOLARIS 
            if((fd=accept(fd, &frominet, &fromlen) ) < 0) 
#else 
            if((fd=accept(fd, (struct sockaddr *) &frominet, &fromlen) ) < 0) 
#endif /* SOLARIS */ 
                err("accept"); 
            mes("accept"); 
            if (window) { 
                if (setsockopt (fd, SOL_SOCKET, SO_SNDBUF, (char *) &sendwin, 
                    sizeof(sendwin)) < 0 ) 
                    printf("set send window size didn't work\n"); 
                if (setsockopt (fd, SOL_SOCKET, SO_RCVBUF, (char *) &rcvwin, 
                    sizeof(rcvwin)) < 0 ) 
                    printf("set rcv window size didn't work\n"); 
                if (getsockopt (fd, SOL_SOCKET, SO_SNDBUF, (char *) &sendwin, &optlen) < 0 ) 
                    printf("get send window size didn't work\n"); 
                else printf("send window size = %d\n", sendwin); 
                if (getsockopt (fd, SOL_SOCKET, SO_RCVBUF, (char *) &rcvwin, &optlen) < 0 ) 
                    printf("get rcv window size didn't work\n"); 
                else printf("receive window size = %d\n", rcvwin); 
            } 
        } 
    } 
    prep_timer(); 
    errno = 0; 
    if (sinkmode) { 
        register int cnt; 
        if (trans) { 
            pattern( buf, buflen ); 
            if(udp) (void)Nwrite( fd, buf, 4 );     /* rcvr start */ 
            while (nbuf-- && Nwrite(fd,buf,buflen) == buflen) 
                nbytes += buflen; 
            if(udp) (void)Nwrite( fd, buf, 4 );     /* rcvr end */ 
        } else { 
            while ((cnt=Nread(fd,buf,buflen)) > 0) { 
                static int going = 0; 
                if( cnt <= 4 ) { 
                    if( going ) 
                        break;  /* "EOF" */ 
                    going = 1; 
                    prep_timer(); 
                } else 
                    nbytes += cnt; 
            } 
        } 
    } else { 
        register int cnt; 
        if (trans) { 
            while((cnt=read(0,buf,buflen)) > 0 && 
                Nwrite(fd,buf,cnt) == cnt) 
                nbytes += cnt; 
        } else { 
            while((cnt=Nread(fd,buf,buflen)) > 0 && 
                write(1,buf,cnt) == cnt) 
                nbytes += cnt; 
        } 
    } 
    if(errno) err("IO"); 
    (void)read_timer(stats,sizeof(stats)); 
    if(udp&&trans) { 
        (void)Nwrite( fd, buf, 4 );     /* rcvr end */ 
        (void)Nwrite( fd, buf, 4 );     /* rcvr end */ 
        (void)Nwrite( fd, buf, 4 );     /* rcvr end */ 
        (void)Nwrite( fd, buf, 4 );     /* rcvr end */ 
    } 
    fprintf(stdout, 
        "ttcp%s: %ld bytes in %.2f real seconds = %.2f KB/sec = %.4f Mb/s\n", 
        trans?"-t":"-r", 
        nbytes, realt, ((double)nbytes)/realt/1024, 
        ((double)nbytes)/realt/128000 ); 
    if (verbose) { 
        fprintf(stdout, 
            "ttcp%s: %ld bytes in %.2f CPU seconds = %.2f KB/cpu sec\n", 
            trans?"-t":"-r", 
            nbytes, cput, ((double)nbytes)/cput/1024 ); 
    } 
    exit(0); 

usage: 
    fprintf(stderr,Usage); 
    exit(1); 
} 


err(s) 
char *s; 
{ 
    fprintf(stderr,"ttcp%s: ", trans?"-t":"-r"); 
    perror(s); 
    fprintf(stderr,"errno=%d\n",errno); 
    exit(1); 
} 

mes(s) 
char *s; 
{ 
    fprintf(stderr,"ttcp%s: %s\n", trans?"-t":"-r", s); 
} 

/* fill the buffer with a repeating pattern of printable characters */ 
pattern( cp, cnt ) 
register char *cp; 
register int cnt; 
{ 
    register char c; 

    c = 0; 
    while( cnt-- > 0 ) { 
        while( !isprint((c&0x7F)) ) c++; 
        *cp++ = (c++&0x7F); 
    } 
} 
/******* timing *********/ 
#ifdef SYSV 
extern long time(); 
#if sgi 
static void tvsub(); 
static struct timeval time0;    /* Time at which timing started */ 
#else 
static long time0; 
#endif 
static struct tms tms0; 
#else 
static struct timeval time0;    /* Time at which timing started */ 
static struct rusage ru0;       /* Resource utilization at the start */ 
static void prusage(); 
static void tvadd(); 
static void tvsub(); 
static void psecs(); 
#endif 

/* 
 *      P R E P _ T I M E R 
 */ 
void 
prep_timer() 
{ 
#ifdef SYSV 
#if sgi 
    gettimeofday(&time0, (struct timezone *)0); 
#else 
    (void)time(&time0); 
#endif 
    (void)times(&tms0); 
#else 
    gettimeofday(&time0, (struct timezone *)0); 
    getrusage(RUSAGE_SELF, &ru0); 
#endif 
} 

/* 
 *      R E A D _ T I M E R 
 * 
 */ 
double 
read_timer(str,len) 
char *str; 
{ 
#ifdef SYSV 
    long now; 
    struct tms tmsnow; 
    char line[132]; 
#ifdef sgi 
    struct timeval timedol; 
    struct timeval td; 
    gettimeofday(&timedol, (struct timezone *)0); 
    tvsub( &td, &timedol, &time0 ); 
    realt = td.tv_sec + ((double)td.tv_usec) / 1000000; 
#else 
    (void)time(&now); 
    realt = now-time0; 
#endif 
    (void)times(&tmsnow); 
    cput = tmsnow.tms_utime - tms0.tms_utime; 
    cput /= HZ; 
    if( cput < 0.00001 ) cput = 0.01; 
    if( realt < 0.00001 ) realt = cput; 
    sprintf(line,"%g CPU secs in %g elapsed secs (%g%%)", 
        cput, realt, 
        cput/realt*100 ); 
    (void)strncpy( str, line, len ); 
    return( cput ); 
#else 
    /* BSD */ 
    struct timeval timedol; 
    struct rusage ru1; 
    struct timeval td; 
    struct timeval tend, tstart; 
    char line[132]; 
    getrusage(RUSAGE_SELF, &ru1); 
    gettimeofday(&timedol, (struct timezone *)0); 
    prusage(&ru0, &ru1, &timedol, &time0, line); 
    (void)strncpy( str, line, len ); 
    /* Get real time */ 
    tvsub( &td, &timedol, &time0 ); 
    realt = td.tv_sec + ((double)td.tv_usec) / 1000000; 
    /* Get CPU time (user+sys) */ 
    tvadd( &tend, &ru1.ru_utime, &ru1.ru_stime ); 
    tvadd( &tstart, &ru0.ru_utime, &ru0.ru_stime ); 
    tvsub( &td, &tend, &tstart ); 
    cput = td.tv_sec + ((double)td.tv_usec) / 1000000; 
    if( cput < 0.00001 ) cput = 0.00001; 
    return( cput ); 
#endif 
} 


#ifndef SYSV 
static void 
prusage(r0, r1, e, b, outp) 
register struct rusage *r0, *r1; 
struct timeval *e, *b; 
char *outp; 
{ 
    struct timeval tdiff; 
    register time_t t; 
    register char *cp; 
    register int i; 
    int ms; 

    /* t = CPU time used, ms = elapsed time, both in hundredths of a second */ 
    t = (r1->ru_utime.tv_sec-r0->ru_utime.tv_sec)*100+ 
        (r1->ru_utime.tv_usec-r0->ru_utime.tv_usec)/10000+ 
        (r1->ru_stime.tv_sec-r0->ru_stime.tv_sec)*100+ 
        (r1->ru_stime.tv_usec-r0->ru_stime.tv_usec)/10000; 
    ms = (e->tv_sec-b->tv_sec)*100 + (e->tv_usec-b->tv_usec)/10000; 
#define END(x) {while(*x) x++;} 
    cp = "%Uuser %Ssys %Ereal %P %Xi+%Dd %Mmaxrss %F+%Rpf %Ccsw"; 
    for (; *cp; cp++) { 
        if (*cp != '%') 
            *outp++ = *cp; 
        else if (cp[1]) switch(*++cp) { 
        case 'U': 
            tvsub(&tdiff, &r1->ru_utime, &r0->ru_utime); 
            sprintf(outp,"%d.%01d", tdiff.tv_sec, tdiff.tv_usec/100000); 
            END(outp); 
            break; 
        case 'S': 
            tvsub(&tdiff, &r1->ru_stime, &r0->ru_stime); 
            sprintf(outp,"%d.%01d", tdiff.tv_sec, tdiff.tv_usec/100000); 
            END(outp); 
            break; 
        case 'E': 
            psecs(ms / 100, outp); 
            END(outp); 
            break; 
        case 'P': 
            sprintf(outp,"%d%%", (int) (t*100 / ((ms ? ms : 1)))); 
            END(outp); 
            break; 
        case 'W': 
            i = r1->ru_nswap - r0->ru_nswap; 
            sprintf(outp,"%d", i); 
            END(outp); 
            break; 
        case 'X': 
            sprintf(outp,"%d", t == 0 ? 0 : (r1->ru_ixrss-r0->ru_ixrss)/t); 
            END(outp); 
            break; 
        case 'D': 
            sprintf(outp,"%d", t == 0 ? 0 : 
                (r1->ru_idrss+r1->ru_isrss-(r0->ru_idrss+r0->ru_isrss))/t); 
            END(outp); 
            break; 
        case 'K': 
            sprintf(outp,"%d", t == 0 ? 0 : 
                ((r1->ru_ixrss+r1->ru_isrss+r1->ru_idrss) - 
                (r0->ru_ixrss+r0->ru_idrss+r0->ru_isrss))/t); 
            END(outp); 
            break; 
        case 'M': 
            sprintf(outp,"%d", r1->ru_maxrss/2); 
            END(outp); 
            break; 
        case 'F': 
            sprintf(outp,"%d", r1->ru_majflt-r0->ru_majflt); 
            END(outp); 
            break; 
        case 'R': 
            sprintf(outp,"%d", r1->ru_minflt-r0->ru_minflt); 
            END(outp); 
            break; 
        case 'I': 
            sprintf(outp,"%d", r1->ru_inblock-r0->ru_inblock); 
            END(outp); 
            break; 
        case 'O': 
            sprintf(outp,"%d", r1->ru_oublock-r0->ru_oublock); 
            END(outp); 
            break; 
        case 'C': 
            sprintf(outp,"%d+%d", r1->ru_nvcsw-r0->ru_nvcsw, 
                r1->ru_nivcsw-r0->ru_nivcsw ); 
            END(outp); 
            break; 
        } 
    } 
    *outp = '\0'; 
} 



static void 
tvadd(tsum, t0, t1) 
struct timeval *tsum, *t0, *t1; 
{ 
    tsum->tv_sec = t0->tv_sec + t1->tv_sec; 
    tsum->tv_usec = t0->tv_usec + t1->tv_usec; 
    if (tsum->tv_usec > 1000000) 
        tsum->tv_sec++, tsum->tv_usec -= 1000000; 
} 

static void 
tvsub(tdiff, t1, t0) 
struct timeval *tdiff, *t1, *t0; 
{ 
    tdiff->tv_sec = t1->tv_sec - t0->tv_sec; 
    tdiff->tv_usec = t1->tv_usec - t0->tv_usec; 
    if (tdiff->tv_usec < 0) 
        tdiff->tv_sec--, tdiff->tv_usec += 1000000; 
} 

/* format a number of seconds as [h:]mm:ss */ 
static void 
psecs(l,cp) 
long l; 
register char *cp; 
{ 
    register int i; 

    i = l / 3600; 
    if (i) { 
        sprintf(cp,"%d:", i); 
        END(cp); 
        i = l % 3600; 
        sprintf(cp,"%d%d", (i/60) / 10, (i/60) % 10); 
        END(cp); 
    } else { 
        i = l; 
        sprintf(cp,"%d", i / 60); 
        END(cp); 
    } 
    i %= 60; 
    *cp++ = ':'; 
    sprintf(cp,"%d%d", i / 10, i % 10); 
} 
#endif 
/* 
 *      N R E A D 
 */ 
Nread( fd, buf, count ) 
{ 
    struct sockaddr_in from; 
    int len = sizeof(from); 
    register int cnt; 
    if( udp ) { 
        cnt = recvfrom( fd, (char *) buf, count, 0, (struct sockaddr *) &from, &len ); 
    } else { 
        if( b_flag ) 
            cnt = mread( fd, buf, count );  /* fill buf */ 
        else 
            cnt = read( fd, buf, count ); 
    } 
    return(cnt); 
} 

/* 
 *      N W R I T E 
 */ 
Nwrite( fd, buf, count ) 
{ 
    register int cnt; 
    if( udp ) { 
again: 
        cnt = sendto( fd, (char *) buf, count, 0, (struct sockaddr *) &sinhim, 
            sizeof(sinhim) ); 
        if( cnt<0 && errno == ENOBUFS ) { 
            /* no buffer space: wait briefly, then retry the send */ 
            delay(18000); 
            errno = 0; 
            goto again; 
        } 
    } else { 
        cnt = write( fd, buf, count ); 
    } 
    return(cnt); 
} 

delay(us) 
{ 
    struct timeval tv; 
    tv.tv_sec = 0; 
    tv.tv_usec = us; 
    (void)select( 1, (fd_set *)0, (fd_set *)0, (fd_set *)0, &tv ); 
    return(1); 
} 

/* 
 *      M R E A D 
 * 
 * This function performs the function of a read(II) but will 
 * call read(II) multiple times in order to get the requested 
 * number of characters. This can be necessary because 
 * network connections don't deliver data with the same 
 * grouping as it is written with. Written by Robert S. Miles, BRL. 
 */ 
int 
mread(fd, bufp, n) 
int fd; 
register char *bufp; 
unsigned n; 
{ 
    register unsigned count = 0; 
    register int nread; 

    do { 
        nread = read(fd, bufp, n-count); 
        if(nread < 0) { 
            perror("ttcp_mread"); 
            return(-1); 
        } 
        if(nread == 0) 
            return((int)count); 
        count += (unsigned)nread; 
        bufp += nread; 
    } while(count < n); 
    return((int)count); 
} 

#if sgi 
static void 
tvsub(tdiff, t1, t0) 
struct timeval *tdiff, *t1, *t0; 
{ 
    tdiff->tv_sec = t1->tv_sec - t0->tv_sec; 
    tdiff->tv_usec = t1->tv_usec - t0->tv_usec; 
    if (tdiff->tv_usec < 0) 
        tdiff->tv_sec--, tdiff->tv_usec += 1000000; 
} 
#endif 

APPENDIX B: RCP PROGRAM 


#include <stdio.h> 
#include <stdlib.h> 
#include <string.h> 
#include <sys/time.h> 

long elapsed_sec,               /* Seconds variable */ 
     elapsed_usec;              /* Microseconds variable */ 

int loop_counter, 
    a, b;                       /* Subroutine result variables */ 

int n = 5;                      /* Number of timed transfers */ 
int file_size;                  /* File size in bytes */ 
double part_usec, total_time, average_time = 0.0; 

char name[30], system_name[30]; 
char rcp_string[100] = "rcp";   /* large enough for "rcp <file> <dest>" */ 
char blank_string[2] = " "; 
int true = 1; 
char answer[2]; 
char* get_name(char *string); 

/* Variable structure defns */ 
struct timeval timestart, timedone; 
struct timezone zonestart, zonedone; 

main() 
{ 
    /* Get file name & Dest machine name & path */ 
    printf("\n\n\n Here is a list of available files for transferring: \n\n"); 
    system ("ls -al"); 

    while(answer[0] != 'y') 
    { 
        printf("\n Input the file name to be transferred: \n\n"); 
        gets(name); 
        printf("\n Is the below input correct? Enter y if yes or n if incorrect: \n\n"); 
        puts(name); 
        printf("\n"); 
        gets(answer); 
    } 

    answer[0] = 'n';            /* reset for next loop */ 

    /* Get file size */ 
    while(answer[0] != 'y') 
    { 
        printf("\n Input the file size to be transferred: \n\n"); 
        scanf("%d", &file_size); 
        printf("\n Is the below input correct? Enter y if yes or n if incorrect: \n\n"); 
        printf("%d\n", file_size); 
        gets(answer); 
        gets(answer); 
        answer[0] = 'y'; 
    } 

    answer[0] = 'n';            /* reset for next loop */ 

    while(answer[0] != 'y') 
    { 
        printf("\n Input the Dest machine name & path to be transferred: \n\n"); 
        printf("An example would be: gold-fddi:/usr/test/wtog_test\n\n"); 
        gets(system_name); 
        printf("\n Is the below input correct? Enter y if yes or n if incorrect: \n\n"); 
        puts(system_name); 
        printf("\n"); 
        gets(answer); 
    } 

    /* Build the command line: rcp <file> <destination> */ 
    strcat(rcp_string, blank_string); 
    strcat(rcp_string, name); 
    strcat(rcp_string, blank_string); 
    strcat(rcp_string, system_name); 

    /* Set up outer loop to execute transfers n times */ 
    for (loop_counter = 1; loop_counter <= n; loop_counter += 1) 
    { 
        /* Get start time in sec&usec and check if successful */ 
        a = gettimeofday(&timestart, &zonestart); 
        if (a != 0) 
            printf ("Oops ! %d\n", a); 
        /* Use system call to do file transfer */ 
        system (rcp_string); 
        /* system ("rcp american_pie.au gold-fddi:/usr/test/wtog_test"); */ 
        /* Get stop time in sec&usec and check if successful */ 
        b = gettimeofday(&timedone, &zonedone); 
        if (b != 0) 
            printf ("Oops! %d\n", b); 

        /* Get structure values for calculations. */ 
        elapsed_sec = timedone.tv_sec - timestart.tv_sec; 
        elapsed_usec = timedone.tv_usec - timestart.tv_usec; 

        /* Make sure that we account for the usec */ 
        /* variable rolling over (through zero) */ 
        if (elapsed_sec >= 1) 
        { 
            if (elapsed_usec < 0) 
            { 
                elapsed_sec -= 1; 
                elapsed_usec += 1000000; 
            } 
        } 

        /* Convert the usec variable to a floating point number. */ 
        part_usec = elapsed_usec/1.0e6; 

        /* Add the seconds to the microseconds to get a real number */ 
        total_time = elapsed_sec + part_usec; 

        /* And print the results on the CRT */ 
        printf ("%f\t%f\n", total_time, ((file_size*8/total_time)/1000000)); 
        average_time += total_time; 
    } 
    /* Print out the results of the avg transfer rate */ 
    printf("\n\nIs this time correct? %f", average_time); 
    printf("The average time was %f and the average transfer rate was %f\n", average_time/n, 
        ((file_size*8/(average_time/n))/1000000)); 

    /* This is the end of the control loop. */ 
    exit (0); 
} 
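
The elapsed-time and transfer-rate arithmetic used by the program above can be isolated into two small helpers. The following is a sketch for illustration only: the names `elapsed_seconds` and `transfer_rate_mbps` are hypothetical and do not appear in the thesis program, and the rate uses the same convention as the listing (bits divided by 1,000,000).

```c
#include <sys/time.h>

/*
 * Elapsed seconds between two gettimeofday() samples.  One second is
 * borrowed when the microsecond field of `end` is smaller than that
 * of `start`.
 */
double elapsed_seconds(struct timeval start, struct timeval end)
{
    long sec = (long)(end.tv_sec - start.tv_sec);
    long usec = (long)(end.tv_usec - start.tv_usec);

    if (usec < 0) {
        sec -= 1;
        usec += 1000000L;
    }
    return (double)sec + (double)usec / 1.0e6;
}

/* Transfer rate in Mbps for `file_bytes` moved in `seconds`. */
double transfer_rate_mbps(long file_bytes, double seconds)
{
    if (seconds <= 0.0)
        return 0.0;
    return ((double)file_bytes * 8.0) / seconds / 1.0e6;
}
```

Timing a transfer then amounts to sampling `gettimeofday()` before and after the copy and passing the two samples to `elapsed_seconds()`.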








APPENDIX C: NEAL NELSON BENCHMARK RESULTS 


TABLE 9: CPU SUBSYSTEM 

                                                     | GOLD2.SOL   | Gold 
CPU Type                                             | Sparc       | Sparc 
CPU Clock Speed                                      | 45MHz       | 50 MHz 
Total Size of Main Memory                            | 224 Mbytes  | 224 Mbytes 
Speed of Main Memory Chips                           | 80 ns       | 80 ns 
Type and Speed of Math Coprocessor                   | None        | None 
Number of Main CPUs                                  | 2           | 2 

TABLE 10: DISK SUBSYSTEM 

Total Number of Disk Controllers                     | 1 
Total Number of Disk Devices                         | 
Disk Drive Type                                      | SCSI 
Disk Drive Brand/Model                               | Seagate 
Disk Average Seek Time: 
  Seagate ST11200                                    | 2-10.5 ms 
  Seagate ST1480                                     | 
Does system have I/O buses separate from the 
main bus?                                            | Yes         | Yes 

TABLE 11: CACHE INFORMATION 

Does the system have instruction or data cache?      | Yes         | Yes 
How many levels of instruction/data cache are there? | 2           | 2 
How is cache coherency accomplished?                 | Snooping with invalidation | Snooping with invalidation 
Does CPU have separate instruction and data caches?  | Yes         | Yes 
Total size of all instructions/data caches: 
  On-board Instruction                               | 20 Kbytes   | 20 Kbytes 
  On-board Data                                      | 16 Kbytes   | 16 Kbytes 
  (Note: External SuperCache controller provides 1 Mbyte external cache) 
Total swap                                           | approx 280 Mbytes | approx 280 Mbytes 


Group 1: Tests a mix of activities that are intended to approximate the processing
activities for the following five types of users. Group 1 includes the following tests:


1) Simulated Office Automation Workload 

2) Simulated Database Workload 

3) Simulated Software Development Workload 

4) Simulated Transaction Processing Workload 

5) Simulated Calculation Workload (Math/Statistics/CAD/CAM) 


Group 2: Tests designed to perform various types of calculation tasks and thereby 
profile the performance of the computer’s calculation subsystem. Group 2 includes the 


following tests:

6) Write to Shared Memory

7) Read from Memory, Small Instruction Area, Small Data Area
8) Read from Memory, Small Instruction Area, Larger Data Area
9) Read from Memory, Larger Instruction Area, Small Data Area
10) Read from Memory, Larger Instruction Area, Larger Data Area
11) Make Machine Page or Swap with ‘malloc’ and ‘free’

12) Combined Integer and Floating Point Math

13) Math Library Functions 

14) Semaphores, Shared Memory, Context Switch 

15) Write to and Read from Pipes, Context Switch 

16) Sample System Calls 

17) Increasing Depth of Function Calls 
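
The Neal Nelson test programs themselves are not reproduced in this appendix. Purely as an illustration of the kind of work a pipe test such as test 15 exercises, the hypothetical sketch below pushes fixed-size blocks through a pipe inside one process. The function name `pipe_round_trips` and the 1024-byte block limit are assumptions made for this example, and because it uses a single process it does not reproduce the context-switch component of the real test.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Push `rounds` blocks of `block` bytes through a pipe inside one process,
 * reading each block back before writing the next so the pipe never fills.
 * Returns the total number of bytes read back, or -1 on error. */
long pipe_round_trips(int rounds, size_t block)
{
    int fds[2];
    char out[1024], in[1024];
    long total = 0;
    int i;

    /* Reject unsupported block sizes before the pipe is created. */
    if (block == 0 || block > sizeof(out) || pipe(fds) != 0)
        return -1;
    memset(out, 'x', sizeof(out));

    for (i = 0; i < rounds; i++) {
        ssize_t got;

        if (write(fds[1], out, block) != (ssize_t)block) {
            total = -1;
            break;
        }
        got = read(fds[0], in, block);
        if (got != (ssize_t)block) {
            total = -1;
            break;
        }
        total += got;
    }
    close(fds[0]);
    close(fds[1]);
    return total;
}
```

A benchmark harness would wrap a call such as `pipe_round_trips(1000, 1024)` between two `gettimeofday()` samples and report the elapsed seconds, which is the unit used in the result tables of this appendix.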


Group 3: Tests that perform a series of disk input and output functions to profile the 


performance of the disk subsystem. Group 3 includes the following tests: 


18) 1024 byte Sequential Reads from Unix File(s) 

19) 1024 byte Sequential Writes to Unix File(s)

20) 8192 byte Sequential Reads from Unix File(s)

21) 8192 byte Sequential Writes to Unix File(s)

22) 4096 byte Synchronized Reads from Unix File(s) 

23) 4096 byte Synchronized Reads from Raw Device(s) 
24) 16384 byte Synchronized Reads from Unix File(s) 
25) 16384 byte Synchronized Reads from Raw Device(s) 
26) 4096 byte Pseudo Random Reads from Unix File(s) 
27) 4096 byte Pseudo Random Reads from Raw Device(s) 
28) Profile Disk Cache for Unix File(s) 

29) Profile Disk Cache for Raw Device(s) 

30) 8192 byte Sequential Writes then ‘sync’


Gold Versus White, Two Processors


TABLE 12: GOLD2.SOL VRS WHITE2.SOL, TEST 1 & 2 & 3 & 4

TABLE 13: GOLD2.SOL VRS WHITE2.SOL, TEST 5 & 6 & 7 & 8

TABLE 14: GOLD2.SOL VRS WHITE2.SOL, TEST 9 & 10 & 11 & 12

TABLE 15: GOLD2.SOL VRS WHITE2.SOL, TEST 13 & 14 & 15 & 16

TABLE 16: GOLD2.SOL VRS WHITE2.SOL, TEST 17 & 18 & 19 & 20

TABLE 17: GOLD2.SOL VRS WHITE2.SOL, TEST 21 & 22 & 23 & 24

TABLE 18: GOLD2.SOL VRS WHITE2.SOL, TEST 25 & 26 & 27 & 28

TABLE 19: GOLD2.SOL VRS WHITE2.SOL, TEST 29 & 30


Gold One Processor Versus Gold Two Processors Results


TABLE 20: GOLD1.SOL VRS GOLD2.SOL, TEST 1 & 2 & 3 & 4

TABLE 21: GOLD1.SOL VRS GOLD2.SOL, TEST 5 & 6 & 7 & 8

TABLE 22: GOLD1.SOL VRS GOLD2.SOL, TEST 9 & 10 & 11 & 12

TABLE 23: GOLD1.SOL VRS GOLD2.SOL, TEST 13 & 14 & 15 & 16

TABLE 24: GOLD1.SOL VRS GOLD2.SOL, TEST 17 & 18 & 19 & 20

TABLE 25: GOLD1.SOL VRS GOLD2.SOL, TEST 21 & 22 & 23 & 24

TABLE 26: GOLD1.SOL VRS GOLD2.SOL, TEST 25 & 26 & 27 & 28

TABLE 27: GOLD1.SOL VRS GOLD2.SOL, TEST 29 & 30


Solaris 2.3 One Processor Versus SunOS 4.1.3 One Processor Results


TABLE 28: GOLD1.SOL VRS GOLD1.SUN, TEST 1 & 2 & 3 & 4

TABLE 29: GOLD1.SOL VRS GOLD1.SUN, TEST 5 & 6 & 7 & 8

TABLE 30: GOLD1.SOL VRS GOLD1.SUN, TEST 9 & 10 & 11 & 12

TABLE 31: GOLD1.SOL VRS GOLD1.SUN, TEST 13 & 14 & 15 & 16

TABLE 32: GOLD1.SOL VRS GOLD1.SUN, TEST 17 & 18 & 19 & 20

TABLE 33: GOLD1.SOL VRS GOLD1.SUN, TEST 21 & 22 & 23 & 24

TABLE 34: GOLD1.SOL VRS GOLD1.SUN, TEST 25 & 26 & 27 & 28

TABLE 35: GOLD1.SOL VRS GOLD1.SUN, TEST 29 & 30


APPENDIX D: NTTCP SINGLE PROCESSOR RESULTS 


TABLE 36: SINGLE PARAMETER TEST RESULTS


TABLE 37: SINGLE PROCESSOR, 1ST TEST RESULTS

LLC Buffers: 48K
Single Test

LLC Buffers: 48K
Single Test

LLC Buffers: 48K


TABLE 40: SINGLE PROCESSOR, 4TH TEST RESULTS

From: Gold      LLC Buffers: 48K
To: White

LLC Buffers: 48K
Single Test

LLC Buffers: 48K
Single Test


TABLE 43: SINGLE PROCESSOR, 7TH TEST RESULTS

LLC Buffers: 48K
Dual Test

From: Gold      LLC Buffers: 48K
To: White

LLC Buffers: 48K
Single Test


TABLE 46: SINGLE PROCESSOR, 10TH TEST RESULTS

From: Gold      LLC Buffers: 48K
To: White       Single Test

LLC Buffers: 48K

From: Gold      LLC Buffers: 48K


TABLE 49: SINGLE PROCESSOR, 13TH TEST RESULTS

LLC Buffers: 48K
To: Gold        Single Test

LLC Buffers: 48K
Single Test

LLC Buffers: 48K
To: Gold


TABLE 52: SINGLE PROCESSOR, 16TH TEST RESULTS

From: Gold      LLC Buffers: 48K
To: White

LLC Buffers: 48K
To: Gold        Single Test


TABLE 54: SINGLE PROCESSOR, 18TH TEST RESULTS

LLC Buffers: 48K
Single Test


TABLE 55: SINGLE PROCESSOR, 19TH TEST RESULTS

LLC Buffers: 48K
TTRT: 11ms      Dual Test

LLC Buffers: 48K
TTRT: 11ms      Dual Test

Threads: 16     LLC Buffers: 48K
TTRT: 11ms      Single Test


TABLE 58: SINGLE PROCESSOR, 22ND TEST RESULTS

From: Gold      Threads: 16     LLC Buffers: 48K
To: White       TTRT: 11ms      Single Test


TABLE 59: SINGLE PROCESSOR, 23RD TEST RESULTS

LLC Buffers: 48K

LLC Buffers: 48K


TABLE 61: SINGLE PROCESSOR, 25TH TEST RESULTS

LLC Buffers: 48K
Single Test


TABLE 63: SINGLE PROCESSOR, 27TH TEST RESULTS

LLC Buffers: 48K


TABLE 64: SINGLE PROCESSOR, 28TH TEST RESULTS

LLC Buffers: 48K
Dual Test

LLC Buffers: 48K
Single Test

LLC Buffers: 48K
Single Test


TABLE 67: SINGLE PROCESSOR, 31ST TEST RESULTS

LLC Buffers: 48K

LLC Buffers: 48K
Single Test, FDDI Boards Switched


TABLE 70: SINGLE PROCESSOR, 34TH TEST RESULTS

LLC Buffers: 48K
Single Test, FDDI Boards Switched


APPENDIX E: NTTCP TWO PROCESSORS RESULTS


TABLE 71: PARAMETERS USED FOR TWO PROCESSOR TEST

Columns: Test Number; From; To; NFS asynchronous threads; TTRT; sbf_num_; mtu


TABLE 72: TWO PROCESSORS, 1ST TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

From: White
To: Gold


TABLE 73: TWO PROCESSORS, 2ND TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 48K
MTU: 4352 Bytes

TABLE 74: TWO PROCESSORS, 3RD TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 48K
MTU: 4352 Bytes


TABLE 75: TWO PROCESSORS, 4TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 48K
MTU: 4352 Bytes


TABLE 76: TWO PROCESSORS, 5TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 48K
MTU: 4352 Bytes

TABLE 77: TWO PROCESSORS, 6TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 48K
TTRT: 11ms      MTU: 4352 Bytes


TABLE 78: TWO PROCESSORS, 7TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

Threads: 8      LLC Buffers: 48K
TTRT: 25ms      MTU: 4352 Bytes


TABLE 79: TWO PROCESSORS, 8TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

Threads: 16     LLC Buffers: 48K
TTRT: 25ms      MTU: 4352 Bytes

TABLE 80: TWO PROCESSORS, 9TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

From: White     Threads: 8      LLC Buffers: 56K
To: Gold        TTRT: 8ms       MTU: 4352 Bytes


TABLE 81: TWO PROCESSORS, 10TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

From: White     LLC Buffers: 56K
To: Gold        MTU: 4352 Bytes


TABLE 82: TWO PROCESSORS, 11TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 56K
MTU: 4352 Bytes

TABLE 83: TWO PROCESSORS, 12TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

From: White     Threads: 16     LLC Buffers: 56K
To: Gold        TTRT: Sms       MTU: 4352 Bytes


TABLE 84: TWO PROCESSORS, 13TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

From: White     Threads: 8      LLC Buffers: 56K
To: Gold        TTRT: 11ms      MTU: 4352 Bytes


TABLE 85: TWO PROCESSORS, 14TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

                LLC Buffers: 56K
To: Gold        TTRT: 11ms      MTU: 4352 Bytes

TABLE 86: TWO PROCESSORS, 15TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

From: White     Threads: 8      LLC Buffers: 56K
To: Gold        TTRT: 25ms      MTU: 4352 Bytes


TABLE 87: TWO PROCESSORS, 16TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

From: White     Threads: 16     LLC Buffers: 56K
To: Gold        TTRT: 25ms      MTU: 4352 Bytes


TABLE 88: TWO PROCESSORS, 17TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 40K
MTU: 4352 Bytes


TABLE 89: TWO PROCESSORS, 18TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 40K
MTU: 4352 Bytes


TABLE 90: TWO PROCESSORS, 19TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

MTU: 4352 Bytes


TABLE 91: TWO PROCESSORS, 20TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 40K
MTU: 4352 Bytes

TABLE 92: TWO PROCESSORS, 21ST TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

From: White     Threads: 8      LLC Buffers: 40K
To: Gold        TTRT: 11ms      MTU: 4352 Bytes


TABLE 93: TWO PROCESSORS, 22ND TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

From: White     Threads: 16     LLC Buffers: 40K
To: Gold        TTRT: 11ms      MTU: 4352 Bytes


TABLE 94: TWO PROCESSORS, 23RD TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

From: White     Threads: 8      LLC Buffers: 40K
To: Gold        TTRT: 25ms      MTU: 4352 Bytes

TABLE 95: TWO PROCESSORS, 24TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 40K
MTU: 4352 Bytes


TABLE 96: TWO PROCESSORS, 25TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)


TABLE 97: TWO PROCESSORS, 26TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 48K
MTU: 4192 Bytes

TABLE 98: TWO PROCESSORS, 27TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

To: Gold        TTRT: Sms


TABLE 99: TWO PROCESSORS, 28TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

                LLC Buffers: 48K
To: Gold        MTU: 4192 Bytes


TABLE 100: TWO PROCESSORS, 29TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

From: White     Threads: 8      LLC Buffers: 48K
To: Gold        TTRT: 11ms      MTU: 4192 Bytes

TABLE 101: TWO PROCESSORS, 30TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 48K
TTRT: 11ms      MTU: 4192 Bytes


TABLE 102: TWO PROCESSORS, 31ST TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

From: White     LLC Buffers: 48K
To: Gold        MTU: 4192 Bytes


TABLE 103: TWO PROCESSORS, 32ND TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 48K
MTU: 4192 Bytes

TABLE 104: TWO PROCESSORS, 33RD TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

                LLC Buffers: 56K
To: Gold        MTU: 4192 Bytes


TABLE 105: TWO PROCESSORS, 34TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 56K
MTU: 4192 Bytes


TABLE 106: TWO PROCESSORS, 35TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 56K
MTU: 4192 Bytes

TABLE 107: TWO PROCESSORS, 36TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)


TABLE 108: TWO PROCESSORS, 37TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 56K
TTRT: 11ms      MTU: 4192 Bytes


TABLE 109: TWO PROCESSORS, 38TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 56K
TTRT: 11ms      MTU: 4192 Bytes

TABLE 110: TWO PROCESSORS, 39TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

Threads: 8      LLC Buffers: 56K
TTRT: 25ms      MTU: 4192 Bytes


TABLE 111: TWO PROCESSORS, 40TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

From: White     LLC Buffers: 56K
To: Gold        TTRT: 25ms      MTU: 4192 Bytes


TABLE 112: TWO PROCESSORS, 41ST TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 40K
MTU: 4192 Bytes

TABLE 113: TWO PROCESSORS, 42ND TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)


TABLE 114: TWO PROCESSORS, 43RD TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)


TABLE 115: TWO PROCESSORS, 44TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 40K
MTU: 4192 Bytes

TABLE 116: TWO PROCESSORS, 45TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

TTRT: 11ms


TABLE 117: TWO PROCESSORS, 46TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 40K
MTU: 4192 Bytes


TABLE 118: TWO PROCESSORS, 47TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

From: White     Threads: 8      LLC Buffers: 40K
To: Gold        TTRT: 25ms      MTU: 4192 Bytes


TABLE 119: TWO PROCESSORS, 48TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

LLC Buffers: 40K
MTU: 4192 Bytes


TABLE 120: TWO PROCESSORS, 49TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

From: White     Threads: 8      LLC Buffers: 48K     Max Throughput Prediction Test
To: Gold        TTRT: 8ms       MTU: 4352 Bytes


TABLE 121: TWO PROCESSORS, 50TH TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

From: Gold-50MHz    Threads: 8      LLC Buffers: 48K
To: White-50MHz     TTRT: 8ms       MTU: 4352 Bytes

TABLE 122: TWO PROCESSORS, 51ST TEST RESULTS

Columns: Window Size (K bytes); File A through File H (Mbps)

From: White-50MHz   Threads: 8      LLC Buffers: 48K
To: Gold-50MHz      TTRT: 8ms       MTU: 4352 Bytes


APPENDIX F: GLOSSARY OF TERMS


802.2         IEEE standard for the Logical Link Control.

ACK           Acknowledge. A network packet acknowledging the receipt of
              data.

ARP           Address Resolution Protocol. A TCP/IP protocol to translate an IP
              address into a MAC address.

ANSI          American National Standards Institute. A private organization that
              coordinates some United States standards-making. Represents the
              United States to the International Standards Organization.

ARPA          Advanced Research Projects Agency. A Department of Defense
              agency that has helped fund many computer projects including
              ARPANET, the Berkeley version of Unix and TCP/IP. ARPA used to
              be known as DARPA.

ARPANET       Advanced Research Projects Agency Network. A Department of
              Defense sponsored network of military and research organizations.
              Replaced by the Defense Data Network (DDN).

ASIC          Application-Specific Integrated Circuits.

asynchronous  FDDI term for data transmission where all requests for service
              contend for a pool of ring bandwidth.

bandwidth     The amount of data that can be moved through a particular
              communications link. FDDI has a bandwidth of 100 Mb/s.

beacon        A token ring packet that signals a serious failure on the ring.

BER           Bit Error Rate.

bps           Bits per second. Transmission speed over some media.

CCITT         Comité Consultatif International Télégraphique et Téléphonique
              (Consultative Committee for International Telephone and
              Telegraph). Standards-making body administered by the
              International Telecommunications Union.

Defense Advanced Research Projects Agency. See ARPA. 
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DAS 


DDN 


DLL 
DMA 


DNS 


ICMP 


IEEE 


IGMP 





Dual Attached Stations. FDDI term for a node that is attached to 
both the primary and secor dary fiber optic cables (as opposed to a 
node that is connected tc ie ring via a concentrator or not dual 
attached. 


Defense Data Network. A network for the Department of Defense 
and their contractors based on the TCP/IP and X.25 networking 
protocols. 


Direct Memory Access. This is a device (controller) for controlling 
the trensfer of data directly to or from the memory without 
invol, jz the processor. The DMA controller becomes the bus 
master and directs the reads or writes between itself and memory. 


Domain Name System. A mechanism used in the Internet for 
translating names of host computers into addresses. The DNS also 
allows host computers not directly on the Internet to have registered 
names in the same style. 


Fiber Distributed Data Interface. A 100 M/bs fiber optic LAN 
standard based on the token ring. 


File Transfer Protocol. FTP is the Internet standard for file transfer. 
FTP was designed from the start to work between different hosts, 
runing different operating systems and using different file 
structures. RFC 959 is the official specification for FTP. 


Internet Control Message Protocol. ICMP is often considered part 
of the IP layer. It communicates error messages and other 
conditions that require attention. ICMP messages are transmitted 
within IP datagrams. RFC 792 contains the official specification of 
ICMP. 
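
To make the message format concrete, the following sketch builds the bytes of an ICMP echo request (type 8, code 0, as specified in RFC 792) and fills in the Internet checksum. It only constructs the message and sends nothing. The function names, identifier, sequence number, and payload are illustrative assumptions.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement of the 16-bit one's-complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"              # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:               # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """ICMP echo request: type, code, checksum, identifier, sequence, data."""
    # The checksum is computed with the checksum field itself set to zero.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


packet = build_echo_request(identifier=0x1234, sequence=1, payload=b"fddi")
print(packet.hex())
print(internet_checksum(packet) == 0)  # a message with a valid checksum sums to zero
```

A receiver can validate the message the same way: running the checksum over the whole message, checksum field included, yields zero when the message is intact.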


Institute of Electrical and Electronics Engineers. A leading 
standards-making body in the United States, responsible for the 802 
standards for local area networks. 


Internet Group Management Protocol. IGMP lets all the systems 
on a physical network know which hosts currently belong to which 
multicast groups. This information is required by the multicast 
routers, so they know which multicast datagrams to forward onto 
which interfaces. IGMP is defined in RFC 1112. 


Internet 


IP 
ISO 
LAN 


LLC 


Mbps 


NAK 


OSI 


A collection of networks that share the same namespace and use 
the TCP/IP protocols. 


Internet Protocol. The network layer protocol for the Internet. 
International Organization for Standardization. 


Local area network. Usually refers to Ethernet or token ring 
networks. 


Logical Link Control. The upper portion of the data link layer, 
defined in the IEEE 802.2 standard. The logical link control layer 
presents a uniform interface to the user of the data link service, 
usually a network layer. Underneath the LLC sublayer of the data 
link layer is a Media Access Control (MAC) sublayer. The MAC 
sublayer is responsible for taking a packet of data from the LLC 
and submitting it to the particular data link being used. 


Media Access Control. This layer provides fair and deterministic 
access to the medium. 


Million bits per second. 10^6 bits of information per second (usually 
used to express a data transfer rate, as in 1 megabit/second = 1 Mbps). 


Maximum transfer unit. The biggest piece of data that can be 
transferred by the data link layer. 
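
The effect of the MTU on the number of packets needed for a transfer can be sketched as follows. The MTU values (1500 bytes for Ethernet, 4352 bytes for IP over FDDI) are commonly cited figures rather than values taken from this glossary, and the 40-byte header allowance (IP plus TCP without options) and the function name are assumptions made for the example.

```python
def packets_needed(payload_bytes: int, mtu: int, header_bytes: int = 40) -> int:
    """Number of packets required when each carries at most mtu - header_bytes of data."""
    per_packet = mtu - header_bytes
    if per_packet <= 0:
        raise ValueError("MTU must be larger than the header allowance")
    return -(-payload_bytes // per_packet)   # ceiling division


TRANSFER_BYTES = 1_000_000   # hypothetical transfer size

print(packets_needed(TRANSFER_BYTES, mtu=1500))   # Ethernet-sized MTU
print(packets_needed(TRANSFER_BYTES, mtu=4352))   # FDDI-sized MTU
```

A larger MTU means fewer packets for the same data, and so fewer per-packet processing steps on the sending and receiving hosts.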


Negative acknowledgment. Response to nonreceipt or receipt of a 
corrupt packet of information. 


Network File System. A distributed file system developed by Sun 
Microsystems and widely used on TCP/IP systems. 


Network Information Service. Name service in the Sun Open 
Network Computing (ONC) family. 


Network Peripheral Inc. The manufacturer of the FDDI interface 
cards used in this investigation on the Sun SPARC workstations. 


Nonreturn-to-Zero Inverted. NRZI is an example of differential 
encoding. In differential encoding, the signal is decoded by 
comparing the polarity of adjacent signal elements rather than 
determining the absolute value of a signal element. 
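
A minimal sketch of differential encoding, assuming the convention in which a 1 bit is signalled by a change of level and a 0 bit by no change. The function names and the starting level are illustrative assumptions.

```python
def nrzi_encode(bits, initial_level=0):
    """Return the line level for each bit: a 1 toggles the level, a 0 keeps it."""
    level = initial_level
    levels = []
    for bit in bits:
        if bit:
            level ^= 1
        levels.append(level)
    return levels


def nrzi_decode(levels, initial_level=0):
    """Recover bits by comparing each level with the one before it."""
    previous = initial_level
    bits = []
    for level in levels:
        bits.append(1 if level != previous else 0)
        previous = level
    return bits


data = [1, 0, 1, 1, 0, 0, 1]
line = nrzi_encode(data)
print(line)                        # [1, 1, 0, 1, 1, 1, 0]
print(nrzi_decode(line) == data)   # True

# Inverting the polarity of the whole signal does not change the decoded bits.
inverted = [1 - level for level in line]
print(nrzi_decode(inverted, initial_level=1) == data)   # True
```

The last check shows why the decoder needs only the relationship between adjacent signal elements: the same bits come out even when every level is flipped.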

Open System Interconnection. 


Physical Connection Management. 
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PHY 


PMD 


PROM 
RARP 
RISC 


SMT 


SPARC 


SUN 


TCP/IP 





Physical Layer. PHY provides the media independent functions 
associated with the OSI physical layer. 


Physical Medium Dependent Layer. PMD specifies the 
transmitters, receivers and other associated hardware. 


Programmable Read-Only Memory. 
Reverse Address Resolution Protocol. 


Reduced Instruction Set Computer. Generic name for CPUs that use 
a simpler instruction set than more traditional designs. The Sun 
SPARC workstation uses RISC technology. 


Station Management. This layer provides the capability 
to monitor the FDDI network. SMT can provide services such as 
node initialization, bypassing faulty nodes, and recovery. 


Scalable Processor Architecture. A reduced instruction set (RISC) 
processor developed by Sun and licensed by several vendors 
including AT&T and Texas Instruments. 


Stanford University Network. This name was given for a printed 
circuit board developed in 1981 that was designed to run the UNIX 
operating system. 


Transmission Control Protocol/Internet Protocol. This is a common 
shorthand which refers to the suite of application and transport 
protocols which run over IP. These include FTP, Telnet, SMTP, and 
UDP. 


Token holding timer. Token ring and FDDI term for the amount of 
time a node can transmit data before sending the token back out to 
the ring. 


Target token rotation time. A term used in FDDI to set performance 
parameters. The TTRT serves as a measure of expected delay and is 
used, among other things, to set time-out parameters. 
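
The relationship between the two timers above can be sketched with a simplified model of the timed-token rule: a station that receives the token earlier than the TTRT may transmit asynchronous data for the time remaining, and a station that receives it late has no asynchronous budget. The model ignores synchronous allocations and other details of the standard, and the function name and millisecond values are hypothetical.

```python
def asynchronous_budget_ms(ttrt_ms: float, measured_rotation_ms: float) -> float:
    """Token holding time available for asynchronous traffic in this simplified model."""
    if ttrt_ms <= 0 or measured_rotation_ms < 0:
        raise ValueError("TTRT must be positive and the rotation time non-negative")
    return max(0.0, ttrt_ms - measured_rotation_ms)


TTRT_MS = 8.0   # hypothetical negotiated target token rotation time

print(asynchronous_budget_ms(TTRT_MS, 5.0))   # token arrived early: 3.0 ms to transmit
print(asynchronous_budget_ms(TTRT_MS, 9.5))   # token arrived late: 0.0 ms
```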


User Datagram Protocol. 
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